ABSTRACT


This  report  contains  the  documentation  of  a  multiple  linear 
regression  program  for  up  to  50  independent  variables,  written  in 
FORTRAN  IV  for  the  IBM  7030  (STRETCH)  computer.  The  program 
incorporates  part  of  the  results  obtained  from  an  effort  to  explore 
the  present  limitations  of  high  speed  computation  in  the  area  of 
linear  statistical  models.  DA-MRCA  includes  options  for  both 
forward  and  backward  automatic  ranking  of  the  independent  variables 
by  order  of  prediction  power  for  the  dependent  variable.  The  report 
contains  the  description  of  these  options,  along  with  an  outline  of 
the  applicability  of  the  program  which  includes,  in  a  convenient 
form, non-orthogonal analysis of variance. Justifications are given
for  extensive  checks  made  on  the  accuracy  of  the  matrix  inversions. 

The  resulting  internal  decisions  and  their  effects  on  the  computational 
flow  are  described  in  detail.  Also,  a  failure  analysis  is  given  in 
which  causes  for  failures  to  obtain  acceptable  inverses  and  possible 
consequences  of  corrective  measures  are  discussed. 

FOREWORD


The  DA-MRCA  program  (Dahlgren  Multiple  Regression  Comprehensive 
Analysis)  documented  in  this  report  is  partially  based  on  the  TV-MRCA 
program  (Tennessee  Valley  Authority  Multiple  Regression  Comprehensive 
Analysis) of the Tennessee Valley Authority. The TV-MRCA program
became available to the authors through the SHARE Program Library.
Although much larger in scope and applicability, DA-MRCA still contains
some computational details from its nucleus routine, TV-MRCA.

(In  order  to  reflect  this  fact  the  initials  "MRCA"  have  been  retained 
for  the  present  program.)  TV-MRCA  included,  for  a  regression  model 
containing  up  to  23  independent  variables,  the  bases  for  the  features 
described at the following places of the present report: Paragraph C
of Section VI.2.a.(1); paragraphs A-F of Section VI.2.a.(2) (excluding
all references to ANOVA tables, the final comprehensive analysis, IVOR,
and BIVOR); paragraphs A, B, and I of Section VI.2.a.(3); and Section
VI.2.a.(4) (excluding the option for selected input design points).
These features were applicable, in TV-MRCA, to the main run and to hand
selected  reruns.  The  first  additions  to  and  revisions  of  the  coding 
of  the  TV-MRCA  program  were  performed  by  Mr.  R.  Scanlon,  Mr.  D.  Green, 
and  Mrs.  Julia  Gray,  members  of  the  former  Scientific  Programming  and 
Analysis  Branch,  Computation  Division. 

The  work  reported  was  done  in  the  Mathematical  Statistics  Branch, 
Operations  Research  Division,  and  the  Operations  Sciences  Branch, 
Computer  Programming  Division,  with  Foundational  Research  Funds  No. 
29Y/R0110101/WR-6-7042 ("Computer Programs for Statistical Analyses").

The  flow  charts  contained  in  the  present  documentation  were  drawn 
by  Messrs.  Thomas  B.  Yancey  and  John  S.  Darling  and  the  report  was 
typed  by  Miss  Judy  D.  Merryman. 

The work on this report was completed on 26 March 1966.


APPROVED  FOR  RELEASE: 
Technical Director

I.  INTRODUCTION

The  need  for  a  capable  computer  routine  to  solve  extensive 
multiple  regression  problems  in  the  application  of  statistical  methods 
to naval ordnance research studies and other investigations at the
Naval Weapons Laboratory led to the development of the present
DA-MRCA program. Connected with this development was an effort to
explore the present limitations of high speed computation in the area
of  linear  statistical  models.  The  program  incorporates  part  of  the 
results  obtained  from  this  research. 

DA-MRCA  has  served,  during  all  stages  of  its  development,  in 
the  solution  of  actual  statistical  problems  and,  also,  in  research 
studies  to  develop  more  advanced  and/or  specialized  computer  routines 
(to  be  documented)  for  statistical  analyses.  After  years  of  additions 
to  and  revisions  of  the  program  it  is  felt  that  DA-MRCA  has  reached  a 
desired  format  and  that  its  documentation  is  appropriate  at  this  time. 

The  DA-MRCA  program  is  written  in  FORTRAN  IV  for  the  IBM  7030 
(STRETCH)  computer  and  performs  all  the  usual  phases  of  a  multiple 
linear  regression  analysis,  that  is,  an  analysis  based  upon  the  model 

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_\nu x_\nu + \cdots + \beta_N x_N + e \qquad \text{(I-1)}$$

where

$y$ = "dependent" (random) variable

$x_\nu$ = "independent" (non-random) variables, $\nu = 1, \ldots, N$

$\beta_\nu$ = regression coefficients, $\nu = 1, \ldots, N$

$\beta_0$ = a constant

$e$ = "residual", or "error" term: a random variable with
expectation zero and variance $\sigma^2$, usually assumed to be
normally distributed.

The  upper  limit  for  the  number  of  independent  variables  to  be  included 
in  the  model  is  N=50.  The  main  results  of  the  analysis  (based  on  a 
set  of  observed  x  and  y  values  and  obtained  by  the  principle  of  least 
squares) are the estimates of the regression coefficients, $\beta_\nu$, the
constant, $\beta_0$, and the residual variance, $\sigma^2$, i.e., a prediction
formula for the dependent variable and a measure of its accuracy.
Furthermore, the following features are included in the program:
Computation of predicted values of the dependent variable at selected
input design points and/or "synthetic" design points; computation
of  prediction  standard  deviations  for  the  construction  of  confidence 
or  tolerance  limits  at  selected  input  design  points  and/or  synthetic 
design  points;  a  listing  of  the  prediction  errors,  e;  a  bar-chart 
and  a  Chi-square  test  on  the  normality  of  these  errors;  computation 
of  the  standard  deviations  of  the  regression  coefficients;  printout 
of  the  full  inverse  of  the  matrix  of  the  normal  equations;  computation 
of various other pertinent statistics, an analysis-of-variance table,
and a final comprehensive printout. For more details about these
features see Chapter VI. (It should be noted that DA-MRCA is not
capable of handling more than one dependent variable at a time.
Nor can the program obtain weighted least squares solutions or
fit regression models through the origin.)

Since  the  theoretical  aspects  of  the  normal  phases  of  multiple 
regression  analysis  form  a  well  established  part  of  mathematical 
statistics  (see,  for  example,  Anderson  and  Bancroft  [1952]),  these 
aspects  need  not  be  discussed  in  this  report. 

In  addition  to  the  "usual"  features,  the  program  has  three 
options for the identification of the significant independent variables.
These options are discussed in more detail in Chapter III. In the
first option, the model is re-evaluated on the basis of a "hand"
selected subset of $N' < N$ independent variables. This option can be
used to test the null hypothesis on any specified subset of $N - N'$
regression coefficients, $\beta_\nu$. In the other two options the independent
variables are automatically ranked by order of prediction power for the
dependent variable. The first of these options employs the "IVOR"
routine ("Independent Variable Ordering by Regression Sums of Squares").
This routine uses a forward or "build-up" technique to rank the
independent variables in descending order of importance. The second
ranking option employs "BIVOR" ("Backward Independent Variable
Ordering by Regression Sums of Squares"). This routine uses a reverse
ordering  technique  by  which  the  independent  variables  are  ranked  in 
ascending  order  of  importance.  In  Chapter  III,  it  is  shown  that  the 
disturbing  effects  of  possibly  existing  "compounds"  (to  be  defined) 
upon  the  ranking  of  the  independent  variables  can  be  avoided  only  by 
application  of  the  BIVOR  technique.  Therefore,  the  BIVOR  option  is 
recommended  whenever  feasible.  There  are,  however,  situations  in 
which  the  IVOR  technique  has  its  advantages,  as  also  discussed  in 
Chapter  III. 
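
The idea of a backward ordering can be illustrated by a minimal Python sketch. It should be noted that this is not the BIVOR routine itself, whose details are given in Chapter III. It is a generic backward elimination under the assumption that, at each step, the variable whose removal increases the residual sum of squares the least is ranked as the least important of those remaining. The function names `solve`, `residual_ss`, and `backward_ranking`, as well as the data, are hypothetical.

```python
# Hypothetical sketch of a backward ("reverse") ranking; not the DA-MRCA code.

def solve(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def residual_ss(xs, ys, keep):
    """Residual sum of squares of the least squares fit using columns `keep`."""
    rows = [[1.0] + [x[j] for j in keep] for x in xs]  # leading 1 = constant
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    b = solve(xtx, xty)
    return sum((y - sum(bi * ri for bi, ri in zip(b, r))) ** 2
               for r, y in zip(rows, ys))


def backward_ranking(xs, ys):
    """Return column indices in ascending order of importance."""
    remaining = list(range(len(xs[0])))
    ranking = []
    while remaining:
        # Assumed criterion: drop the variable whose removal raises the
        # residual sum of squares the least.
        weakest = min(
            remaining,
            key=lambda j: residual_ss(xs, ys, [v for v in remaining if v != j]),
        )
        ranking.append(weakest)
        remaining.remove(weakest)
    return ranking


# Made-up data: y depends strongly on column 0, weakly on column 2,
# and not at all on column 1.
demo_xs = [[1, 4, 2], [2, 1, 7], [3, 8, 1], [4, 2, 9],
           [5, 9, 3], [6, 3, 8], [7, 7, 4], [8, 5, 6]]
demo_ys = [10 * x[0] + x[2] for x in demo_xs]
print(backward_ranking(demo_xs, demo_ys))
```

A forward ("build-up") ordering would proceed in the opposite direction, starting from the constant alone and adding one variable at a time.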

Essentially  all  of  the  "usual"  features  which  were  listed 
previously are also applied, or can optionally be applied, in the
"reruns" of these three options for the identification of the
significant independent variables.

Also built into the program are extensive checks on the accuracy
of  the  computations.  The  elements  of  the  calculated  identity  matrix 
are  checked  for  their  deviations  from  either  1  or  0,  and  internal 
decisions  are  made  with  respect  to  the  acceptance  of  the  matrix 
inversions  according  to  accuracy  requirements  imposed  by  the  program 
user. The details of these checks are discussed in Sections VI.1.b
and VI.2.
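
The kind of check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inverse is formed by Gauss-Jordan elimination, the "calculated identity matrix" is taken to be the product of the matrix and its computed inverse, and a single user tolerance is applied to the largest deviation from 1 (diagonal) or 0 (off-diagonal). The names `invert`, `identity_deviation`, and `inverse_is_acceptable` are hypothetical and do not come from DA-MRCA, whose actual decision rules are given in Chapter VI.

```python
def invert(a):
    """Gauss-Jordan inverse of a small square matrix (lists of lists)."""
    n = len(a)
    # Augment each row with the corresponding row of the identity matrix.
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]


def identity_deviation(a, a_inv):
    """Largest |(A * A_inv)[i][j] - delta_ij| over all elements."""
    n = len(a)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            prod = sum(a[i][k] * a_inv[k][j] for k in range(n))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(prod - target))
    return worst


def inverse_is_acceptable(a, a_inv, tolerance):
    """Accept the inverse only if the calculated identity is close enough."""
    return identity_deviation(a, a_inv) <= tolerance


normal_matrix = [[4.0, 2.0], [2.0, 3.0]]   # made-up example matrix
inverse = invert(normal_matrix)
print(inverse_is_acceptable(normal_matrix, inverse, 1e-8))
```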

A  preprocessor  program  for  DA-MRCA,  MTRAN,  has  been  developed 
for  possible  transformations  of  observed  x  and  y  values  if  such  are 
necessary.  This  program,  however,  is  not  described  at  length  in  the 
present report but is covered in a separate documentation (Herring
[1966]). For a discussion of variable transformations, see Sections
II.2 and VII.2.a.

The  various  chapters  of  this  report  are  directed  at  different 
types  of  readers.  Chapter  II  is  mainly  for  the  reader  who  wants  to 
be  informed  about  the  possible  applications  of  the  program.  No 
specialized  statistical,  mathematical  or  programming  knowledge  is 
required for understanding this chapter, except for Section II.3,
where  some  knowledge  of  analysis  of  variance  is  necessary.  (As  in 
Chapter  II,  programming  knowledge  is  not  required  for  reading  Chapters 
III  through  VII.)  Chapter  III  is  written  mainly  for  the  analyst 
seeking  information  about  the  theory,  techniques,  and  use  of  the 
three  model  re-evaluation  options  of  the  program,  especially  IVOR 
and  BIVOR.  (These  two  procedures  are  introduced  with  this  report.) 
Chapters IV and V define the terms used and explain the input
preparation for the program, respectively, and are, therefore, essential
for any program user. Chapter VI is written for the analyst who
wants information on the computations and the meaning of the printouts.
Program running time formulae and an example problem are also given in
this  chapter.  Chapter  VII  can  be  of  assistance  to  the  program  user  in 
case  of  a  failure  to  obtain  a  problem  solution.  Chapter  VIII  is  written 
for  the  programmer  and  for  the  programming-oriented  analyst.  This 
chapter  contains  the  FORTRAN  IV  documentation  of  DA-MRCA  (including 
flow  charts)  and  is  essential  for  program  changes  and/or  conversions. 

The  reader  will  notice  some  repetition  in  reading  the  report  as 
a  whole.  However,  the  report  is  intended  not  only  as  a  complete 
description  of  DA-MRCA,  but  also  as  a  direct  work  aid  in  which  case 
the  program  user  would  generally  refer  only  to  a  specific  chapter  or 
section  at  a  time.  Each  section  contains  all  the  necessary  information, 
often  given  in  the  form  of  references  to  other  sections. 

II.  APPLICABILITY OF THE PROGRAM


In this chapter the various types of problems to which the
DA-MRCA program can be applied are discussed. Some general statements
about the applicability are followed by sections on specific
types of application.

II.1 General Applicability

The DA-MRCA program is applicable to all problems in which
a preconceived linear mathematical model of the form

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_\nu x_\nu + \cdots + \beta_N x_N \qquad \text{(II-1)}$$

is to be evaluated on the basis of $n \ge N+1$ given sets of values,
$\{y; x_1, x_2, \ldots, x_N\}$, by use of the principle of least squares. Essentially
this evaluation consists of solving for the unknown coefficients,
$\beta_\nu$ ($\nu = 0, 1, \ldots, N$), and attaching a measure of importance to the
individual variables, $x_\nu$, thereby characterizing their "prediction
power" for $Y$. In the narrower sense of multiple linear regression
($n > N+1$) the $n$ observations, $y$, of the "dependent" variable (random)
are expressed in the terms of the multiple regression model (I-1),

$$y = Y + e = \beta_0 + \sum_{\nu=1}^{N} \beta_\nu x_\nu + e,$$

where the $x_\nu$ are the "independent variables" ($\nu = 1, \ldots, N$) and where
$e$ is a random variable with expectation zero and variance $\sigma^2$. (Note
that the regression model (I-1) is obtained by merely adding the random
variable $e$ to the mathematical model (II-1).) Although $e$ is usually
assumed to be normally distributed, it does not have to be unless
statistical hypotheses about the $\beta_\nu$ are to be tested, or confidence
intervals are to be constructed.
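
As an illustration of the evaluation just described, the following minimal Python sketch forms the normal equations for a model with a constant term and solves them for the coefficient estimates. It is not an excerpt from DA-MRCA; the function names (`solve_linear_system`, `fit_least_squares`, `residual_variance`) and the data are assumptions made for the example, and the system is solved by plain Gaussian elimination rather than by the checked matrix inversion used in the program.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular or nearly so")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_least_squares(xs, ys):
    """Return [b0, b1, ..., bN] for design points xs[i] and observations ys[i]."""
    rows = [[1.0] + list(x) for x in xs]  # leading 1 carries the constant
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def residual_variance(xs, ys, b):
    """Residual sum of squares divided by n - (N + 1); needs n > N + 1."""
    sse = 0.0
    for x, y in zip(xs, ys):
        predicted = b[0] + sum(bv * xv for bv, xv in zip(b[1:], x))
        sse += (y - predicted) ** 2
    return sse / (len(ys) - len(b))


# Made-up data lying exactly on Y = 1 + 2*x1 + 3*x2 (n = 5, N = 2).
design_points = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
observations = [1, 3, 4, 6, 8]
estimates = fit_least_squares(design_points, observations)
print(estimates, residual_variance(design_points, observations, estimates))
```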

The $i$-th set of observations, $\{y; x_1, x_2, \ldots, x_N\}_i$, is defined by
the coordinates of the dependent variable and the N independent
variables and is called the $i$-th "data point." The numerical data of
a given regression problem is comprised of $n$ such data points
($i = 1, \ldots, n$). The $i$-th set of coordinates of the N independent
variables, $\{x_1, x_2, \ldots, x_N\}_i$, is called the $i$-th "input design point."

In general, there is no restriction concerning the relative position
of the input design points except, naturally, in the case of linear
dependencies in the matrix of the normal equations. (See Section VII.2.b.)
For example, the design points do not have to define a complete
rectangular grid in the N-dimensional space, a situation in which
orthogonal polynomials are often used. The application of these does
require such (orthogonal) grids.

The $x_\nu$ values, in the theory of multiple regression, are assumed
to be non-random, that is, they are determined at the will of the
experimenter. However, in a more general interpretation, they may
also be values which have been measured, or observed, without
appreciable error. Sometimes multiple regression is applied in such
a broad sense that the only requirement for a given variable being
used as an "independent" variable is the assumption of a cause-effect
relationship between the variable and the "dependent" variable,
y. All errors originating from the "independent" variables $x_\nu$ are
then attributed, by definition, to the variability of y, and the $x_\nu$
are again considered as non-random variables. According to the
definition of the model (I-1), the y values for a given design point
are assumed to be randomly and independently sampled from a distribution
(usually normal) with expectation

$$Y = \beta_0 + \sum_{\nu=1}^{N} \beta_\nu x_\nu$$

and variance $\sigma^2$.

With  the  above,  the  general  linear  multiple  regression  problem, 
to  which  DA-MRCA  is  applicable,  consists  of  fitting  a  least  squares 
surface of the form (II-1) to $n$ observations $y_i$ at $n$ input design
points  (not  necessarily  all  distinct),  where  these  points  are  located 
in  the  N-dimensional  space  defined  by  the  N  independent  variables. 
Specifically,  the  program  serves  to  identify  those  independent 
variables  which  explain  a  significant  portion  of  the  variability  in 
the  numerical  values  of  y,  or,  in  other  words,  which  have  significant 
prediction power for y. One possibility to arrive at this identification
is by application of the automatic ranking procedures IVOR and/or
BIVOR.  IVOR  and  BIVOR  each  provide  for  the  ranking  of  all  N  independent 
variables  simultaneously,  or  for  ranking  independent  variables  within 
specified  groups.  A  second  possibility  to  identify  the  significant 
independent  variables  is  to  apply  the  option  for  "hand  selecting"  a 
specified  subset  of  independent  variables  to  be  deleted  from  the 
original  model,  and  then  test  the  contribution  of  these  deleted 
independent  variables  to  the  fit.  Also  possible  is  the  computation 
of  statistics  necessary  for  the  construction  of  confidence  intervals 
for  the  true  response  values  Y  at  the  input  design  points  and/or 
"synthetic"  design  points  located  within  the  original  experimental 
space.


By definition, the least squares fit for the model (I-1) reduces
to a "perfect fit" when the number $n_N$ ($n_N \le n$) of distinct input design
points in the N-dimensional space is equal to $N+1$. When $n_N = n\ (= N+1)$,
i.e., when there is exactly one value $y_i$ at each distinct design point
(the surface being a perfect fit to each individual value $y_i$, $i = 1, 2, \ldots, n$),
the fit is called a "zero-error perfect fit." This "non-statistical"




NWL  REPORT  NO.  2035 


or "deterministic" use of multiple regression is also possible with
DA-MRCA, as was implied in the statements about the model (II-1) at
the beginning of this section. The application of the program in
this case is discussed, in more detail, in Section II.4.

The linearity of the mathematical model (II-1) depends only
on linearity in the unknown parameters, i.e., in the $\beta_\nu$'s.
The general linear model, consequently, can be conceived to be of
various forms, each of which can be fitted by DA-MRCA. For example,
each $x_\nu$ can be a (non-linear) function of one or more other variables.
Some of the more common equations of linear form are discussed in
Section II.2. There are also many equations that, although non-linear
in their parameters, can be made linear by an appropriate transformation.
The use of DA-MRCA in fitting this type of equation is also discussed
in the next section (II.2).

In  order  to  solve  a  regression  problem  a  decision  must  be  made 
as  to  which  independent  variables  should  be  included  in  the  model  and 
in  which  functional  form  the  chosen  independent  variables  should  be 
included  in  the  model.  Helpful  in  this  decision  may  be  theoretical 
considerations,  previous  experience  with  the  variables,  a  plot  of  the 
data, or some other means. Of particular help can be the use of the
program's  ranking  methods  IVOR  and  BIVOR.  These  methods  allow  the 
analyst  to  start  with  a  possibly  very  elaborate  model  (a  polynomial, 
in  general)  in  which  all  terms  having  in  reality  little  or  no 
prediction  power  for  the  dependent  variable,  y,  will  automatically 
be  identified. 

The  use  and  application  of  IVOR  and  BIVOR  are  explained  in 
detail,  together  with  the  discussion  of  the  theory  of  these  ranking 
procedures, in Section III.2. There it is shown that the BIVOR option
should  be  used,  whenever  possible,  for  the  automatic  ranking  of  the  N 
independent  variables. 

II.2 Specific Linear Models and Linearization

The most straightforward application of the general linear model
(II-1),

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_\nu x_\nu + \cdots + \beta_N x_N,$$

occurs when all N variables, $x_\nu$, represent the first powers of
original observed independent variables. In the example case given
in Section VI.5, where the dependence of y = Ballistic Limit (of
projectile) upon Thickness and Hardness (of target plate) is analyzed,
such a straightforward model would include only the two original
independent variables, Thickness ($x_1$, say) and Hardness ($x_2$, say), and
would, therefore, have the form:

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2.$$

As indicated before, however, the $x_\nu$ can also represent functions of
the form

$$x_\nu = f_\nu\{z_{\nu 1}, \ldots, z_{\nu t}, \ldots\}, \qquad \text{(II-2)}$$

where these functions do not contain parameters to be estimated and
where the $z_{\nu t}$ are variables (assumed to be non-random) whose observed
numerical values completely specify the numerical value of $x_\nu$. The
simplest examples of such functions are the polynomial terms $x_\nu = z^\nu$ of
a single original independent variable, $z$. A model containing only
these terms would appear as

$$Y = \beta_0 + \beta_1 z + \beta_2 z^2 + \cdots + \beta_\nu z^\nu + \cdots + \beta_N z^N,$$

that is, as the equation of an $N$-th degree polynomial in one variable.
More generally, the $x_\nu$ can represent polynomial terms in several
original independent variables, $z_t$. This implies the applicability
of  DA-MRCA  in  the  important  area  of  multivariate  polynomial  fitting 
with  up  to  N=50  polynomial  terms,  including  the  linear  terms.  The 
data  handling  in  this  case  is  very  simple  because  the  numerical  values 
of  the  polynomial  terms  can  be  automatically  generated  by  the  program. 
The  program  user  merely  specifies  which  polynomial  terms  are  to  be 
included  in  the  model  and  writes  as  input  only  the  numerical  values  of 
the original independent variables, $z_t$. From these, the values of the
terms of higher than first order are automatically generated and
internally used as input for the generation of the matrix of the normal
equations. As is true for any type of independent variable, $x_\nu$, the
use  of  the  options  for  hand  selected  reruns  or  for  IVOR  and/or  BIVOR 
will  provide  the  analyst  with  the  desired  information  concerning  the 
necessary  degree  of  the  polynomial  needed  in  the  fit.  This  enables 
the  program  user  to  maximize  the  "goodness  of  fit",  provided  that 
he  starts  with  a  polynomial  equation  of  high  enough  degree  in  all 
original  independent  variables.  IVOR  and  BIVOR  will  automatically 
rank  the  polynomial  terms  according  to  their  prediction  power  for  y 
and  thus  provide  the  analyst  with  a  basis  for  choosing  a  "significant 
model." To illustrate this with the example of Section VI.5, the
analyst might have assumed that the polynomial in $x_1 = z_1$ = Thickness
and $x_2 = z_2$ = Hardness would not have to be of higher than the second
degree in order to predict the Ballistic Limit, y, sufficiently well.
Accordingly, he would enter the program with the model

$$Y = \beta_0 + \beta_1 z_1 + \beta_2 z_2 + \beta_3 z_1^2 + \beta_4 z_1 z_2 + \beta_5 z_2^2.$$




Numerical input would be (besides y) only $x_1 = z_1$ and $x_2 = z_2$, whereas
$x_3 = z_1^2$, $x_4 = z_1 z_2$, and $x_5 = z_2^2$ would be generated by the program. The
application of BIVOR, say, might yield as the "significant model"
(using the symbols $\hat{Y}$ and $b_\nu$ for the estimated parameters):


$$\hat{Y} = b_0 + b_2 z_2 + b_4 z_1 z_2.$$

Here, it is implied that BIVOR ranked the variables $z_1$, $z_1^2$, and $z_2^2$ as
the least important ones and that their contribution to the fit was
found to be nonsignificant according to a prechosen significance level.
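
The automatic generation of polynomial terms from the original variables can be pictured with the following short Python sketch. It is an illustration only: the function `polynomial_terms` and the table `SECOND_DEGREE` are hypothetical names, and the way in which DA-MRCA itself is told which terms to generate is described in the input instructions of Chapter V, not here.

```python
def polynomial_terms(z, exponents):
    """Build the x values of one design point from the original variables.

    z         -- sequence of original variables, e.g. (z1, z2)
    exponents -- one tuple of powers per polynomial term
    """
    terms = []
    for powers in exponents:
        value = 1.0
        for zt, p in zip(z, powers):
            value *= zt ** p
        terms.append(value)
    return terms


# Second-degree model in two variables:
# x1 = z1, x2 = z2, x3 = z1^2, x4 = z1*z2, x5 = z2^2
SECOND_DEGREE = [(1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]

# Made-up values for Thickness (z1) and Hardness (z2).
print(polynomial_terms((2.0, 3.0), SECOND_DEGREE))
```

Only the two original values are supplied per design point; the remaining three columns follow from the table of exponents.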

As  indicated  before,  both  IVOR  and  BIVOR  contain  an  option  for 
grouping  the  independent  variables  such  that  the  ranking  process  takes 
place within only one group at a time. (For more details see Sections
VI.1.d and VI.1.e.) This grouping can be applied to the case of polynomial
terms such that terms of equal degree, for example, will be
ranked exclusively among themselves. The reader is referred to Section
VII.2.a for an important application of this feature in connection with
using transformed variables to increase the computational accuracy when
fitting polynomials.

Although  polynomial  terms  are  the  most  frequently  occurring  type 
of functions, $f_\nu$, in formula (II-2), functions other than polynomials
can as well be represented by the $x_\nu$. Examples are $x_\nu = z_{\nu 1} \sin(z_{\nu 2})$,
$x_\nu = \sqrt{z_{\nu 1} z_{\nu 2}}$, $x_\nu = \log z_\nu$, etc. In particular, such functions will occur
when  linearization  of  the  given  (non-linear)  model  must  be  achieved 
by  transformations. 

Although  the  method  of  least  squares  may  also  be  applied  to 
non-linear  models,  the  normal  equations  which  result  are  non-linear 
in  the  parameters  and  generally  must  be  solved  by  iterative  methods. 
DA-MRCA  is  not  capable  of  fitting  such  equations,  but  some  of  the 
non-linear  equations  can  be  evaluated  after  performing  the  appropriate 
transformation  that  leads  to  the  necessary  linear  form.  Suppose,  for 
example,  the  analyst  wishes  to  consider  the  non-linear  equation 

$$Y^* = \beta_0^* (z_1)^{\beta_1^*}$$

as the model. (The asterisks are used for distinction of the terms
of the non-linear model from those of the linear model.) A simple
transformation to either common or natural logarithms will result in
the linear equation

$$\log Y^* = \log \beta_0^* + \beta_1^* \log z_1,$$

which is identical to the general linear model if one lets $\log Y^* = Y$ of
the linear model, $\log \beta_0^* = \beta_0$, and $\log z_1 = x_1$. In this case, therefore,
the logarithms of the values of both the dependent and independent
variables must be used as input to the program. The resulting least
squares equation can be retransformed into the original form by
substituting the antilog of the estimated coefficient $\log \beta_0^*$ for
$b_0^*$ in the original equation as expressed in estimated terms:

$$\hat{Y}^* = b_0^* (z_1)^{b_1^*}.$$


Another  example  of  a  non-linear  model  that  can  be  linearized  by  a 
logarithmic  transformation  is 


$$Y^* = \beta_0^* (\beta_1^*)^{z_1} (\beta_2^*)^{z_2}.$$

This will lead to

$$\log Y^* = \log \beta_0^* + (\log \beta_1^*) z_1 + (\log \beta_2^*) z_2.$$

With $\log Y^* = Y$, $z_1 = x_1$, and $z_2 = x_2$, only the logarithms of
the values of the dependent variable have to be used as input in this case.

It  should  be  noted  that,  whenever  a  transformation  is  used  to 
linearize  an  equation,  it  is  the  sum  of  squares  of  deviations  on  the 
transformed  variables  that  is  minimized  and  not  the  sum  of  squares  on 
the  original  variables.  This  has  consequences  in  the  use  of  the  results 
from  DA-MRCA:  point  and  interval  estimation  must  be  done  based  on  the 
calculations  for  the  transformed  variables.  Only  after  the  predicted 
values  and/or  confidence  limits  have  been  computed,  will  they  be 
retransformed into the original scale of the non-linear model. As
a  result  one  obtains,  for  example,  non-symmetric  confidence  limits 
about  the  Y  values. 
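
The linearization of the power model by logarithms, and the retransformation of results into the original scale, can be sketched as follows. This is a minimal Python illustration under stated assumptions: natural logarithms are used, a one-variable least squares line is fitted with the textbook closed-form formulas, and the names `fit_line`, `fit_power_law`, and `retransform_limits` as well as the data are hypothetical. In practice the logarithms would be supplied to DA-MRCA as input (or produced by MTRAN) and the program would perform the fit.

```python
import math


def fit_line(xs, ys):
    """Least squares intercept and slope for a single independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_power_law(z, y_star):
    """Fit Y* = b0_star * z**b1 by least squares on the logarithmic scale."""
    log_z = [math.log(v) for v in z]
    log_y = [math.log(v) for v in y_star]
    intercept, slope = fit_line(log_z, log_y)
    # The antilog of the fitted intercept estimates the original constant.
    return math.exp(intercept), slope


def retransform_limits(lower_log, upper_log):
    """Carry limits computed on the log scale back to the original scale."""
    return math.exp(lower_log), math.exp(upper_log)


# Made-up data lying exactly on Y* = 3 * z**1.5.
z_values = [1.0, 2.0, 4.0, 8.0]
y_values = [3.0 * v ** 1.5 for v in z_values]
b0_star, b1 = fit_power_law(z_values, y_values)
print(b0_star, b1)

# Limits that are symmetric about 1.0 on the log scale ...
low, high = retransform_limits(0.5, 1.5)
# ... are no longer symmetric about exp(1.0) on the original scale.
print(math.exp(1.0) - low, high - math.exp(1.0))
```

The last two printed numbers differ, which is the non-symmetry of the retransformed limits mentioned above.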

Often  it  is  necessary  to  apply  a  transformation  only  on  the 
dependent  variable  in  order  to  achieve  a  normal  (or  near-normal) 
distribution  for  y  as  is  desired  in  many  cases.  (The  built-in 
Chi-square  test  on  the  normality  of  the  residuals,  e,  may  give  an 
indication  for  the  necessity  and  type  of  such  a  transformation.  See 
Section VI.1.c.) Another reason for transforming y only could be to
stabilize the variance which might be a function of the coordinates,
$x_\nu$, of the design points. It is a known fact, however, that in many
cases in which a transformation of the y values is appropriate for
either of these two reasons, it is also necessary for the other one.
In addition to this, experience has shown that when the experimental
data indicates the necessity of a transformation for normalizing the
y values and/or for stabilizing their variance, often this is the
only transformation which also linearizes the functional relationship
between Y and the x's. For example, in the model $Y^* = \beta_0^* (\beta_1^*)^{z_1}$, the
observations $y^*$ of the dependent variable will usually not be distributed
normally, but the values of $y = \log y^* = \log \beta_0^* + (\log \beta_1^*) z_1 + e$ often
will be.


Because  of  the  importance  of  the  various  transformations  it  is 
repeated here that the preprocessing program MTRAN ("DA-MRCA Transformation",
see Herring [1966]) is available for use in conjunction
with  DA-MRCA.  This  program  can  perform  the  following  transformations 
on  the  values  of  the  dependent  variable,  the  independent  variable(s), 
or  on  the  values  of  both  types  of  variables: 


    $\ln (A + x)$ *)
    $\ln [B + \ln (C + x)]$ *)
    $\sqrt{x}$
    $\dfrac{1}{D + x}$ *)
    $\sin^{-1} x$
    $2 \sin^{-1} \sqrt{x}$
    $\sin x$
    $\cos x$
    $\dfrac{x}{E}$ *) **)

*) The constants A, B, C, D, E are to be specified by the
analyst.

**) This transformation is only for the independent variables.
The purpose is to increase the matrix inversion accuracy. For details
see Section VII.2.a.
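
For orientation, transformations of the kinds listed above ($\ln(A+x)$, $\ln[B+\ln(C+x)]$, $\sqrt{x}$, $1/(D+x)$, $\sin^{-1}x$, $2\sin^{-1}\sqrt{x}$, $\sin x$, $\cos x$, $x/E$) can be written as a small table of Python functions. This sketch is not the interface of MTRAN, which is documented separately (Herring [1966]); the function `make_transformations`, its keyword arguments, and the dictionary keys are hypothetical and serve only to show what each transformation computes.

```python
import math


def make_transformations(A=0.0, B=0.0, C=0.0, D=0.0, E=1.0):
    """Return a name -> function table; A..E are the analyst's constants."""
    return {
        "ln(A+x)": lambda x: math.log(A + x),
        "ln[B+ln(C+x)]": lambda x: math.log(B + math.log(C + x)),
        "sqrt(x)": math.sqrt,
        "1/(D+x)": lambda x: 1.0 / (D + x),
        "arcsin(x)": math.asin,
        "2*arcsin(sqrt(x))": lambda x: 2.0 * math.asin(math.sqrt(x)),
        "sin(x)": math.sin,
        "cos(x)": math.cos,
        # Scaling by a constant, intended for independent variables only.
        "x/E": lambda x: x / E,
    }


table = make_transformations(A=1.0, E=10.0)
raw_values = [0.0, 50.0, 100.0]          # made-up observations
scaled = [table["x/E"](v) for v in raw_values]
print(scaled)
```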




II.3 Non-Orthogonal Analysis of Variance and Covariance

DA-MRCA, being a program for general multiple linear regression,
can naturally also be applied to analysis of variance and covariance
models, in particular to data classifications with incomplete and/or
unbalanced data (non-orthogonal ANOVA). For the general discussion of
the multiple regression treatment of non-orthogonal analysis of variance,
see Brownlee [1960].

As  an  example  of  the  application  of  DA-MRCA  to  non-orthogonal 
analysis  of  variance,  a  2x3  crossed  classification  with  qualitative 
factors  and  with  unequal  (and  non-proportional)  cell  numbers  is  treated. 

The two factors of the example are denoted as $\mathcal{A}$ and $\mathcal{B}$, and the
analysis of variance model is:

$$y_{\alpha\beta\rho} = Y_{\alpha\beta} + e_{\alpha\beta\rho} = \mu + a_\alpha + b_\beta + ab_{\alpha\beta} + e_{\alpha\beta\rho}.$$

The various terms have the following meaning:

$y_{\alpha\beta\rho}$ = $\rho$-th observation in cell "$\alpha,\beta$" of the response variable
y (random), where

$\rho = 1, \ldots, R_{\alpha\beta}$

$\alpha = 1, \ldots, A$

$\beta = 1, \ldots, B$

with $R_{\alpha\beta}$ being the number of observations in cell "$\alpha,\beta$"
and with A and B being the numbers of levels in factors $\mathcal{A}$
and $\mathcal{B}$, respectively (A = 2 and B = 3 in the present example);

$Y_{\alpha\beta}$ = expected or true value of the response variable y in
cell "$\alpha,\beta$";

$\mu$ = general constant;

$a_\alpha$ = constant for level $\alpha$ of factor $\mathcal{A}$;

$b_\beta$ = constant for level $\beta$ of factor $\mathcal{B}$;

$ab_{\alpha\beta}$ = interaction constant for level combination $\alpha,\beta$;

$e_{\alpha\beta\rho}$ = error term, assumed to be normally independently
distributed with expectation zero and variance $\sigma^2$.

In the multiple regression approach to this case of only
qualitative factors the model constants (in the example: $a_{\alpha}$, $b_{\beta}$,
and $ab_{\alpha\beta}$) become the regression coefficients of auxiliary independent
variables which take on only the values 1 and 0, as will be demonstrated
below.  For  the  inversion  of  the  matrix  of  the  normal  equations,  linear 
restrictions  have  to  be  imposed  on  the  estimates  of  the  various  sets  of 
constants,  reducing  the  number  of  constants  in  each  set  to  the  number 
of  degrees  of  freedom  available  in  each  corresponding  factorial  effect. 
For example, there are A main effect constants $a_{\alpha}$ in factor $\mathcal{A}$, but
only A-1 degrees of freedom are available in the main effect of $\mathcal{A}$.

Since  in  non-orthogonal  analysis  of  variance  for  qualitative  factors, 
the  estimates  of  only  the  contrasts  between  model  constants  are 
meaningful  rather  than  the  estimates  of  the  constants  themselves  (see, 
for example, Graybill [1961], Chapter 13), the choice of the type of
linear  restrictions  imposed  on  the  estimates  of  the  model  constants  is 
arbitrary.  For  the  ease  of  computation,  a  good  choice  is  to  let  the 
last  constant  in  each  set  be  equal  to  zero.  Applied  to  the  present 
example,  this  means: 

$\hat{a}_{A} = \hat{b}_{B} = \widehat{ab}_{A\beta} = \widehat{ab}_{\alpha B} = 0; \quad \alpha = 1, \ldots, A; \quad \beta = 1, \ldots, B.$

The model of the example can be written (using the notation for the
estimates which are in reality only to be found later by least
squares):

$\hat{\eta}_{\alpha\beta} = \hat{\mu} x_0 + \hat{a}_1 x_1 + \hat{b}_1 x_2 + \hat{b}_2 x_3 + \widehat{ab}_{11} x_4 + \widehat{ab}_{12} x_5.$

In this equation, $x_0$ is a dummy variable always taking the value 1 and
the $x_i$, i=1,...,5, are the above mentioned auxiliary variables.


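Each of the 6 cells then leads to an equation of the above form
for each of the corresponding $R_{\alpha\beta}$ observations, giving altogether

$\sum_{\alpha=1}^{2} \sum_{\beta=1}^{3} R_{\alpha\beta} = R_{\cdot\cdot}$

input design points for the multiple regression approach:

$\hat{\eta}_{11} = \hat{\mu} \cdot 1 + \hat{a}_1 \cdot 1 + \hat{b}_1 \cdot 1 + \hat{b}_2 \cdot 0 + \widehat{ab}_{11} \cdot 1 + \widehat{ab}_{12} \cdot 0$
$\hat{\eta}_{12} = \hat{\mu} \cdot 1 + \hat{a}_1 \cdot 1 + \hat{b}_1 \cdot 0 + \hat{b}_2 \cdot 1 + \widehat{ab}_{11} \cdot 0 + \widehat{ab}_{12} \cdot 1$
$\hat{\eta}_{13} = \hat{\mu} \cdot 1 + \hat{a}_1 \cdot 1 + \hat{b}_1 \cdot 0 + \hat{b}_2 \cdot 0 + \widehat{ab}_{11} \cdot 0 + \widehat{ab}_{12} \cdot 0$
$\hat{\eta}_{21} = \hat{\mu} \cdot 1 + \hat{a}_1 \cdot 0 + \hat{b}_1 \cdot 1 + \hat{b}_2 \cdot 0 + \widehat{ab}_{11} \cdot 0 + \widehat{ab}_{12} \cdot 0$
$\hat{\eta}_{22} = \hat{\mu} \cdot 1 + \hat{a}_1 \cdot 0 + \hat{b}_1 \cdot 0 + \hat{b}_2 \cdot 1 + \widehat{ab}_{11} \cdot 0 + \widehat{ab}_{12} \cdot 0$
$\hat{\eta}_{23} = \hat{\mu} \cdot 1 + \hat{a}_1 \cdot 0 + \hat{b}_1 \cdot 0 + \hat{b}_2 \cdot 0 + \widehat{ab}_{11} \cdot 0 + \widehat{ab}_{12} \cdot 0$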


In this example, the numerical values of the auxiliary "independent"
variables associated with the interaction terms, $x_4$ and $x_5$, can be seen
to be the products of the values of the auxiliary independent variables
associated with the two appropriate main effects, $x_1$ and $x_2$, and $x_1$ and
$x_3$, respectively. This "product rule" applies correspondingly also to
all crossed classification models containing higher order interactions,
which simplifies greatly the input writing for non-orthogonal analysis
of variance and covariance for qualitative factors: only the 1's and
0's of the auxiliary variables for the main effects need be input. The
numerical values of the interaction variables are generated by the
program as products according to the specifications put on the appropriate
control card. (For details, see Section V.2, Card Type 3.)
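The coding of the 2x3 example, including the product rule, can be sketched as follows. This is a minimal illustration and not the program's input routine; the function name `design_row` and its arguments are assumptions made for the sketch.

```python
def design_row(alpha, beta, A=2, B=3):
    """Row (x0, x1, ..., x5) for cell (alpha, beta); levels are 1-based.

    A main-effect auxiliary variable is 1 at its own level and 0
    otherwise. The last level of each factor gets no variable, which
    is the "last constant equal to zero" restriction.
    """
    a = [1 if alpha == i else 0 for i in range(1, A)]   # x1
    b = [1 if beta == j else 0 for j in range(1, B)]    # x2, x3
    # Product rule: interaction variables are products of main-effect ones.
    ab = [ai * bj for ai in a for bj in b]              # x4, x5
    return [1] + a + b + ab                             # x0 is always 1

rows = {(al, be): design_row(al, be) for al in (1, 2) for be in (1, 2, 3)}
```

Only the main-effect entries have to be supplied; the two interaction columns follow from them, which mirrors the saving in input writing described above.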

With the design matrix thus generated, the least squares procedure
yields the model estimates, or "regression coefficients", $\hat{\mu}$, $\hat{a}_1$, $\hat{b}_1$, $\hat{b}_2$,
$\widehat{ab}_{11}$, and $\widehat{ab}_{12}$. Also, the sum of squares between cells or "total
regression" sum of squares is given. By the hand re-evaluation option
of DA-MRCA, null hypotheses concerning the various factorial effects
can be tested. However, it is not recommended to test a null hypothesis
on the main effects $\mathcal{A}$ or $\mathcal{B}$ as long as the interaction $\mathcal{AB}$ is present in
the model. The reason is that the additional regression sum of squares
due to $\mathcal{A}$ or $\mathcal{B}$, or, more specifically, due to the auxiliary variables
$x_1$, or $x_2$ and $x_3$, associated with $\mathcal{A}$ or $\mathcal{B}$, respectively, is dependent
upon the arbitrary restrictions imposed on the model constants as long
as $x_4$ and $x_5$ are present in the model. (See Scheffé [1959], p. 117.)

The additional regression sums of squares due to $\mathcal{A}$ or $\mathcal{B}$ become
independent of the arbitrary restrictions only when the auxiliary
variables $x_4$ and $x_5$ of the interaction $\mathcal{AB}$ are deleted from the model.
Therefore, the recommended sequence of testing in the present example
is to first delete simultaneously $x_4$ and $x_5$ (thereby obtaining the
additional regression sum of squares due to $\mathcal{AB}$), and then to delete
the independent variables associated with both $\mathcal{AB}$ and $\mathcal{A}$ or both $\mathcal{AB}$
and $\mathcal{B}$, provided the interaction $\mathcal{AB}$ is not significant. This type of
procedure will be referred to as testing under "restricted admissibility",
i.e., initially only $\mathcal{AB}$ is "admissible" for testing but $\mathcal{A}$ and $\mathcal{B}$ are not.

In order to illustrate the application of DA-MRCA to non-orthogonal
analysis  of  covariance  one  merely  would  have  to  add  covariates  to  the 
above  ANOVA  model  of  the  2x3  crossed  classification  example.  The 
covariates  become  part  of  the  model  for  all  calculations  and  remain  part 
of  it  during  the  testing  of  any  specified  null  hypothesis  concerning  the 
factorial  effects. 

Since the DA-MRCA program can handle up to N=50 independent
variables, 50 is also the upper limit for the number of degrees of
freedom for factorial effects to be included in non-orthogonal analysis
of variance models. In non-orthogonal analysis of covariance this
upper limit of the degrees of freedom for factorial effects is reduced
by the number of covariates included in the model.
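The degrees-of-freedom budget can be checked with a few lines of arithmetic. The sketch below assumes a full crossed classification in which every interaction is included; the function names and the `limit` argument are illustrative assumptions, not part of DA-MRCA.

```python
from itertools import combinations
from math import prod

def effect_df(levels):
    """Degrees of freedom of every factorial effect in a full crossed model.

    levels maps a factor name to its number of levels. An effect has
    the product of (levels - 1) over its factors as degrees of freedom.
    """
    names = list(levels)
    out = {}
    for order in range(1, len(names) + 1):
        for combo in combinations(names, order):
            out["".join(combo)] = prod(levels[f] - 1 for f in combo)
    return out

def fits_in_program(levels, covariates=0, limit=50):
    """True if factorial degrees of freedom plus covariates stay within limit."""
    return sum(effect_df(levels).values()) + covariates <= limit

df_2x3 = effect_df({"A": 2, "B": 3})
```

For the 2x3 example this gives 1, 2, and 2 degrees of freedom for the two main effects and the interaction, i.e., the five auxiliary variables used above.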

Since, in general, individual factorial effects will have more
than one degree of freedom, the automatic ranking procedures IVOR and
BIVOR generally cannot be applied for the ranking by significance of
factorial effects. In cases of only single-degree-of-freedom
qualitative effects, however, this application is possible. For testing
under "restricted admissibility" as discussed before, the
single-degree-of-freedom effects must be grouped, in DA-MRCA, according
to their order, i.e., main effects first, then 2-factor interactions,
then 3-factor interactions, etc. Since the ranking is done within only
one group at a time, this application of BIVOR (or IVOR) guarantees the
restricted admissibility of the effects for testing, although in an
overstrict manner. For example, in a 2x2x2 factorial classification,
the one-degree-of-freedom effects would be grouped as follows. Group
1: $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$; Group 2: $\mathcal{AB}$, $\mathcal{AC}$, $\mathcal{BC}$; Group 3: $\mathcal{ABC}$. BIVOR would delete
$\mathcal{ABC}$ first, then rank $\mathcal{AB}$, $\mathcal{AC}$, and $\mathcal{BC}$, and finally (after deletion of
both the third and second group) rank $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$.

Note: Work is presently in progress on the documentation of
NOVACOM, a FORTRAN IV program for "Non-Orthogonal Variance and Covariance
Analysis by Multiple Regression" which is able to automatically rank
multiple-degree-of-freedom factorial effects under restricted
admissibility. NOVACOM is based on the ideas that were indicated in this
section and, in addition, on some of the suggestions contained in Abt
[1965].


II.4 "Non-Statistical" Applications of DA-MRCA

As  already  mentioned  in  Section  1  of  this  chapter,  DA-MRCA  also 
provides  for  the  possibility  of  "zero-error  perfect  fits."  These 
were defined to be "perfect fits" ($n_d = N+1$) in which there is exactly
one y value at each of the $n_d = n$ distinct design points. Since in
these  cases  the  "error",  or  the  residual  variance,  is  zero,  the 
essential  element  of  statistics  is  absent.  Consequently,  there  is 
no  possibility  to  apply  statistical  tests  or  to  perform  interval 
estimation. 
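A zero-error perfect fit can be illustrated with a tiny table of values. The sketch below assumes N = 2 independent variables (x and x squared) and therefore N+1 = 3 distinct design points with one y value each; it solves the resulting square system exactly with rational arithmetic. The solver and the data are illustrative assumptions, not the program's inversion routine.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve the square system M b = rhs by Gauss-Jordan elimination."""
    n = len(M)
    aug = [list(map(Fraction, row)) + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Pick a row with a non-zero pivot (raises if M is singular).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

xs = [0, 1, 2]                 # three distinct design points
ys = [1, 3, 7]                 # exactly one y value at each point
design = [[1, x, x * x] for x in xs]   # columns: x0 = 1, x1 = x, x2 = x^2
coef = solve(design, ys)
residuals = [y - sum(c * v for c, v in zip(coef, row))
             for y, row in zip(ys, design)]
```

All residuals are exactly zero, so there is no residual variance left on which an F test or an interval estimate could be based.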

The least squares method degenerates to the solution of a system
of N+1 linear equations of rank N+1, having as a solution the perfect
fit. Such a zero-error perfect fit has one of its many applications as
an interpolation formula. Since IVOR and BIVOR are independent of the
existence of an error term, they both can be applied in the case where
the pre-conceived model (i.e., the model with the N independent variables
of the "main run") is a zero-error perfect fit. The subsequent independent
variable selections by IVOR or BIVOR will give (least squares)
interpolation fits of monotonically changing overall accuracy. From these
the analyst can choose the model which satisfies his accuracy
requirements with respect to the prediction of the original values of the
response variable, y. This technique is sometimes very useful when a
closed expression of sufficient accuracy is to be found for the entries
of a table of values.


III. THE IDENTIFICATION OF SIGNIFICANT INDEPENDENT VARIABLES

III.1 Testing a Specified Null Hypothesis by the Main Theorem

The testing of a linear hypothesis concerning the contribution
of any specified subset of N-N' independent variables to the regression
sum of squares due to N independent variables is made possible by a
model re-evaluation option of the program. The test is based on what
may be called the Main Theorem of Multiple Regression. The content of
this theorem (see, for example, Anderson and Bancroft [1952], p. 172)
is as follows:

In the general linear model (I-1),

$y = \sum_{\nu=0}^{N} \beta_{\nu} x_{\nu} + e,$

the residuals, e, are assumed to be normally independently distributed
with expectation zero and variance $\sigma^{2}$. Then, under
$H_0\{\beta_{\nu_1} = \beta_{\nu_2} = \cdots = \beta_{\nu_{N-N'}} = 0\}$, where
$\beta_{\nu_1}, \beta_{\nu_2}, \ldots, \beta_{\nu_{N-N'}}$ are the regression
coefficients of the N-N' independent variables whose contribution to
the regression sum of squares is to be tested, the variance ratio

$F = \frac{SS_{N-N'} / (N-N')}{(ATSS - ASSR_N) / (n-N-1)}$    (III-1)

is distributed as F with N-N' and n-N-1 degrees of freedom. The terms
in this formula are defined as follows:


$ASSR_N$ = "total" regression sum of squares (adjusted for the mean),
with N degrees of freedom, due to all N independent
variables;

$SS_{N-N'}$ = $ASSR_N - ASSR_{N'}$ = "additional regression sum of squares",
with N-N' degrees of freedom, due to the specified
subset of N-N' independent variables, where $ASSR_{N'}$ is
the regression sum of squares (adjusted for the mean)
due to the N' independent variables left in the model
after deleting the N-N' independent variables whose
contribution to the fit is to be tested;

ATSS = $\sum_{i=1}^{n} (y_i - \bar{y})^2$ = total sum of squares (of y) adjusted for
the mean, with n-1 degrees of freedom;

n = total number of observed y values.

When using the model re-evaluation option, the analyst merely
specifies the N-N' independent variables, whose contribution to the
regression sum of squares is to be tested, by indicating the
complementary N' independent variables for which the program will make a
"rerun." The specified set of the N' independent variables in a
particular rerun of this option is called a "Hand Selection" of
independent variables in order to distinguish it from a set
automatically arrived at in any rerun of IVOR or BIVOR.

The F ratios (III-1) are computed and listed for all specified
reruns in a "final comprehensive analysis table."
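The variance ratio (III-1) is simple arithmetic once the sums of squares of the main run and of a rerun are available. The function below is a sketch with assumed argument names; the numbers in the example are made up purely to show the calculation.

```python
def main_theorem_f(assr_n, assr_n_prime, atss, n, N, N_prime):
    """Variance ratio (III-1) with its two degrees of freedom.

    assr_n       : ASSR due to all N independent variables
    assr_n_prime : ASSR due to the N' variables kept in the rerun
    atss         : total sum of squares adjusted for the mean
    n            : number of observed y values
    """
    ss_additional = assr_n - assr_n_prime      # SS_{N-N'}
    df_num = N - N_prime
    df_den = n - N - 1
    f_ratio = (ss_additional / df_num) / ((atss - assr_n) / df_den)
    return f_ratio, df_num, df_den

# Hypothetical values: 4 variables in the main run, 2 kept in the rerun.
f_ratio, df_num, df_den = main_theorem_f(90, 70, 100, n=25, N=4, N_prime=2)
```

The resulting value is then compared with the tabulated F distribution for the two degrees of freedom returned.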


III.2 Ranking by IVOR and BIVOR

The subroutines IVOR and BIVOR for the automatic ranking of the
independent variables by order of importance are also based on the Main
Theorem. The routines may serve to separate the non-significant
independent variables from the significant ones (or to find a
"significant model") according to the F ratio (III-1) which is computed
at each step. IVOR and BIVOR are particularly useful when the analyst
knows nothing about the relative importance of the N IV's, or when the
program user wants to confirm earlier results with new sets of input
data.


The ranking of the independent variables in IVOR and BIVOR is
done according to their prediction power for the dependent variable.
This prediction power is measured by the additional regression sum of
squares, $SS_{N-N'}$ (from the Main Theorem), which is due to the independent
variables in question. It is possible to use, as ranking criterion, the
additional regression sum of squares, or its complementary value, $ASSR_{N'}$,
since the associated degrees of freedom are equal for each independent
variable to be ranked. Therefore, the F test of the Main Theorem,
within each step, has equal power with respect to degrees of freedom
for each independent variable to be ranked.

The  rankings  proceed  as  follows: 

In IVOR, a forward ranking process is executed, which, at the
first step, searches among all N independent variables for the one which
yields the largest value $ASSR_{N'} = ASSR_1$. This is the one independent
variable among the N which, when it is the only one included in the
model, explains the largest portion of the total regression sum of
squares, $ASSR_N$. In the second step, IVOR searches for that pair of
independent variables, consisting of the independent variable ranked
most important in the first step, plus one of the remaining N-1
independent variables, which yields the largest value $ASSR_{N'} = ASSR_2$.
This is continued through step number N-1, at the end of which the first
N-1 most important independent variables will have been ranked. The
least important independent variable (Number N) is, thereby, determined
automatically. Obviously, this ranking procedure results in a descending
order of importance of the independent variables.

In BIVOR, a reverse ranking process is executed, which, at the
first step, searches among all N independent variables for the one
which yields the smallest value $SS_{N-N'} = SS_{N-(N-1)} = SS_1$. This is the
independent variable among the N which, when deleted from the model,
gives the smallest additional regression sum of squares. In the second
step, BIVOR searches for that pair of independent variables, consisting
of the independent variable ranked least important in the first step
plus one of the remaining N-1 independent variables, which yields the
smallest value $SS_{N-N'} = SS_{N-(N-2)} = SS_2$. This is continued through
step number N-1, at the end of which the N-1 least important independent
variables will have been ranked. The most important independent
variable (Number N) is, thereby, determined automatically. As can be
seen, the BIVOR ranking procedure results in an ascending order of
importance of the independent variables.
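The two ranking procedures described above can be sketched as follows. This is an illustrative reconstruction from the verbal description, not the FORTRAN subroutines: the function names, the exact rational arithmetic, the small data set, and the `iq` argument (standing in for a cut-off after a given number of forward steps) are all assumptions of the sketch.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M b = rhs exactly by Gauss-Jordan elimination (M nonsingular)."""
    n = len(M)
    aug = [list(map(Fraction, row)) + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def assr(cols, y):
    """Regression sum of squares (adjusted for the mean) for the given columns."""
    n = len(y)
    X = [[Fraction(1)] + [Fraction(c[i]) for c in cols] for i in range(n)]
    p = len(X[0])
    A = [[sum(r[j] * r[k] for r in X) for k in range(p)] for j in range(p)]
    g = [sum(r[j] * Fraction(yi) for r, yi in zip(X, y)) for j in range(p)]
    b = solve(A, g)                      # solution of the normal equations
    ybar = sum(Fraction(v) for v in y) / n
    fitted = [sum(bj * v for bj, v in zip(b, r)) for r in X]
    return sum((f - ybar) ** 2 for f in fitted)

def ivor(columns, y, iq=None):
    """Forward ranking: column indices in descending order of importance."""
    remaining, ranked = list(range(len(columns))), []
    steps = len(columns) - 1 if iq is None else iq
    while remaining and len(ranked) < steps:
        best = max(remaining,
                   key=lambda j: assr([columns[k] for k in ranked + [j]], y))
        ranked.append(best)
        remaining.remove(best)
    # Without a cut-off, the last variable is determined automatically.
    return ranked + remaining if iq is None else ranked

def bivor(columns, y):
    """Backward ranking: column indices in ascending order of importance."""
    kept, deleted = list(range(len(columns))), []
    full = assr(columns, y)
    while len(kept) > 1:
        def ss_additional(j):
            rest = [columns[k] for k in kept if k != j]
            return full - assr(rest, y)
        least = min(kept, key=ss_additional)
        deleted.append(least)
        kept.remove(least)
    return deleted + kept

cols = [[2, 1, 0, 0, 1, 2], [1, 0, 1, 0, 1, 0], [1, 2, 3, 4, 5, 6]]
y = [5, 4, 9, 8, 13, 13]
forward = ivor(cols, y)
backward = bivor(cols, y)
```

In this small made-up data set the backward order is the exact reverse of the forward order. As discussed below, that agreement is not guaranteed in general.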

In both IVOR and BIVOR the independent variables can optionally
be grouped such that the ranking process is performed within only one
group at a time. For details and for an application of the grouping
feature as a device to save computing time, see Sections VI.1.d and
VI.1.e; for other applications see Sections II.3 and VII.2.a.

As  indicated  earlier,  the  ranking  of  independent  variables  by 
their  prediction  power  in  both  IVOR  and  BIVOR  is  mainly  a  means  of 
identifying  those  IV's  (independent  variables)  which  have  a  significant 
prediction  power  for  the  dependent  variable.  In  addition  to  this,  the 
rankings  give  the  experimenter  an  indication  of  the  relative  importance 
of  the  IV's,  and  these  rankings  sometimes  are  valuable  in  their  own 
right.  Generally,  however,  the  goal  to  be  achieved  with  such  rankings 
is  to  determine  a  "significant  model"  containing  a  minimum  number  of 
IV’s  with  maximum  prediction  power  for  the  dependent  variable.  It  is 
emphasized  that,  for  this  goal,  the  rankings  as  done  by  IVOR  and  BIVOR 
are  not  ideal  but  are  feasible  and  considered  to  be  adequate.  (For  a 
discussion of the "ideal method" see Section III.3.)

It  is  important  to  note  that  an  independent  variable  which,  by 
itself,  has  a  large  prediction  power  for  y  might  not  appear  to  have 
such  in  the  ranking  by  IVOR  or  BIVOR.  This  could  happen,  for  example, 
for  one  of  two  correlated  (possibly  highly)  independent  variables  when 
both  of  them  individually  have  considerable  prediction  power  for  y. 

Both  IVOR  and  BIVOR  would  put  the  one  independent  variable  of  the  two 
which  has  the  higher  (possibly  only  slightly)  prediction  power  into  the 
group  of  important  independent  variables  and  might  rank  the  second  one 
as  being  unimportant.  Accordingly,  this  second  independent  variable 
may then appear to have little or no prediction power. It must be
recalled, however, that the prediction power of an independent
variable, as defined here, is the additional prediction power in
excess of that of the other independent variables already contained
in the model. By itself, the second independent variable may be very
important, but in combination with the first one it loses all its
significance. Thus the ranking order, as established by IVOR or
BIVOR, must be viewed under the aspect of the strictly
prediction-power-oriented character of the ranking processes.

One  might  expect  that  IVOR  and  BIVOR  will  yield  the  same 
ranking  order  of  the  independent  variables.  However,  this  is,  in 
general,  not  the  case.  One  reason  for  this  difference  is  the  possible 
existence, in the data of a regression problem, of a so-called
"compound" which has been defined in Abt [1965]. In brief, a "compound"
is comprised of a set of $K \le N$ independent variables plus the dependent
variable when the error variance $\sigma^{2}$ associated with all K independent
variables is smaller, by orders of magnitude, than the error variance
associated with any subset of K-1 independent variables, i.e., after
any single independent variable has been excluded from the set of K
independent variables comprising the compound together with y.

The  effect  of  the  existence  of  a  compound  upon  the  ranking  of 
independent  variables  is  such  that  in  the  forward  procedure  (as 
executed  by  IVOR)  an  independent  variable  which  does  not  belong  to 
the  compound  might  be  ranked  as  most  important  and  possibly  as 
significant,  whereas  in  the  reverse  procedure  (as  executed  by  BIVOR), 
this  same  independent  variable  might  be  ranked  as  least  important 
and  possibly  as  non-significant.  The  explanation  is  that  in  reverse 
ranking  (BIVOR)  the  unity  of  the  compound  with  its  associated  small 
error  variance  is  preserved,  as  it  should  be,  until  the  latest 
possible  step  of  the  procedure,  whereas  in  forward  ranking  (IVOR) 
this unity could not be reached before the Kth step, and possibly
not  until  the  very  last  step.  A  numerical  example  in  which  the 
latter  actually  happens  is  also  given  in  Abt  [1965]. 

Only  when  both  ranking  procedures  result  in  equal,  or  nearly 
equal,  orderings  will  the  analyst  know  that  there  are  no  compounds 
(or  no  compounds  of  any  consequence)  present  among  the  independent 
variables.  The  only  protection  against  the  disturbing  effects  of 
compounds  upon  the  ranking  is  the  application  of  the  BIVOR  routine. 

It  is,  therefore,  strongly  recommended  to  always  use  the  BIVOR 
option  for  the  automatic  ranking  of  independent  variables.  Moreover, 
BIVOR  is  always  an  economical  choice  since  a  BIVOR  ranking  is  at 
least  4  times  faster  than  a  full  IVOR  ranking.  (For  computational 
details  and  problem  running  time  formulae,  see  Chapter  VI.) 

There are, however, two situations in which IVOR becomes a
desirable option. A less important third situation is discussed in
Section VII.2.b, where IVOR is shown to be advantageous in finding a
"perfect fit." The first situation arises when a large series of
multiple regression problems of equal structure (with the same
independent variables contained in the model for each problem) has
to be processed and when the following two conditions hold true:
(a) the sum of the BIVOR running times would be excessive; (b) one
is only interested in a screening-type investigation as to the first
few most important independent variables in each problem. For this
situation IVOR has a cut-off option to search only for the first
"IQ" most important variables, where IQ is a control card input
number. (See Card Type 4, Section V.2.) That is, IVOR ceases
ranking after step number IQ and, therefore, does not rank the N-IQ
least important independent variables. Naturally, this application
of the IQ-option of IVOR implies the risk of not detecting the effects
of possibly existing compounds upon the ranking order. However, this
is the price for saving computing time. (For IQ much smaller than N
the running time of IVOR is considerably shorter than that of BIVOR;
see time formulae in Section VI.4.)

The second situation in which IVOR becomes desirable also calls
for the cut-off option of IVOR. The situation arises when, in a given
problem with many independent variables, the significant IV's are to
be found, but the final model is to be kept to a minimum number of
independent variables in order to obtain small standard deviations
for interval estimation purposes. In such a situation, the analyst
should apply both BIVOR and IVOR, the latter with an IQ, say, in the
vicinity of what is considered to be the maximum number of independent
variables to be included in the final model. If there are no
compounds, it is possible that the first IQ most important independent
variables (or a subset of them), as ranked by IVOR, account for a
higher portion of the total regression sum of squares than do the
corresponding number of the most important independent variables in
BIVOR. However, this evidence can be obtained only by comparing the
results from both IVOR and BIVOR. This fact serves to re-emphasize
the importance of the BIVOR routine, which should be applied for the
ranking of the independent variables--alone or together with the
IQ-option of IVOR--whenever the available computer time allows its use.

III.3 Comparison of IVOR and BIVOR with Other Techniques

The rankings of the independent variables as done in IVOR and
BIVOR correspond to "forward" and "reverse" ranking, respectively,
as discussed in Abt [1965]. The IVOR ranking proceeds in the same
general forward direction as the "Stepwise Multiple Regression"
technique by Efroymson [1960], but is otherwise different from that
technique, as is obvious from reading Sections III.2 and VI.1.d.

Only after the DA-MRCA program was completed in its present
form, a paper by Hamaker [1962] came to the attention of the authors
in which two computational methods are discussed for the successive
inclusion and deletion of independent variables: "forward selection"
and "backward elimination", respectively. These two methods are
based on analyses of successive residuals, and, therefore, do not
immediately seem to imply results which could be identical with those
of IVOR and BIVOR, respectively. However, the numerical results of
examples exhibited in the paper certainly suggest this both with
respect to the ranking orders of the independent variables and the
associated additional regression sums of squares. No attempt has been
made to prove the general equality of the results of IVOR and "forward
selection" or of those of BIVOR and "backward elimination."

As mentioned in Section III.2, IVOR and BIVOR are not ideal but
are  considered  adequate  for  the  purpose  of  ranking  independent  variables 
by  order  of  importance  and,  thereby,  finding  a  "significant  model." 

Naturally, the ideal method for determining the "significant
model" would be to find the most important IV as in the first step
of IVOR, but then to deviate from IVOR as follows. In the second
step all $\frac{1}{2}N(N-1)$ possible pairs of IV's would be included in the model,
and the one with the largest prediction power would be selected as the
most important pair. Correspondingly, in the third step the most
important triple of IV's would be found, etc. Since the most important
pair of IV's would not necessarily contain the most important single
IV found in the first step (and correspondingly for the triple versus
the pair, and so on) a unique ranking would not necessarily result from
this procedure. The significant model, however, would be found at the
step where the F value (III-1) is non-significant for the first time,
and the procedure could be stopped at this point. This "ideal"
technique may be feasible for small values of N, but for larger N,
such as IVOR and BIVOR are capable of handling, the indicated
technique is infeasible with even the largest computer equipment
available at the present time. In order to illustrate this, the
following comparison of estimated minimum computer times (in seconds,
on the IBM 7030 STRETCH) for the "ideal" technique to the actual
running times of BIVOR, according to formula (VI-23) in Section VI.4,
is given.


Technique (times in seconds)      N = 8      N = 16     N = 32

"Ideal" technique for
finding significant model         24         25400      6.9 x 10^9

BIVOR                             6          13         71

Ratio                             4          ~1950      ~10^8

This table shows, for example, that with N=16 independent
variables in the model, the estimated minimum computer time on the IBM
7030 for the "ideal" technique is 25400 seconds, which is approximately
1950 times the number of seconds BIVOR would need to rank the 16 IV's.
For N=32, the figure is 6.9 billion seconds, whereas BIVOR needs a mere
71 seconds. The times for the "ideal" technique are based on the
assumption that all $2^N - 1$ combinations of the IV's are examined.
Naturally, these times would be, on the average, much smaller if the
procedure were stopped after the significant model was found. However,
the analyst could not predict at which step this would happen, and he
probably would have to consider the times based on the $2^N - 1$ combinations.
The result would be only the significant model, with no indication as
to the relative importance of either the IV's contained in the
significant model or of those not contained in the significant model.

Nevertheless, when N is sufficiently small, the program user can
apply the "ideal" technique by using the option for hand selections of
independent variables. The number of hand selected reruns is restricted,
in one regression problem, to 999. (See Section V.2, Card Type 2,
columns 5-7.) Therefore, N=9 is the upper limit for the number of
independent variables contained in a model which is to be analyzed by
the "ideal" technique: $2^9 - 1 = 511$. However, the analyst has to
specify each combination of independent variables required by the
"ideal" technique on a rerun card (see Section V.2, Card Type 10).
In other words, the technique cannot be executed automatically by
DA-MRCA.
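The limit N=9 follows directly from counting the non-empty subsets of the N independent variables. The short sketch below redoes that count; the function and constant names are assumptions chosen for illustration.

```python
MAX_HAND_RERUNS = 999   # rerun limit quoted above for one regression problem

def ideal_model_count(n_vars):
    """Number of non-empty subsets of n_vars independent variables."""
    return 2 ** n_vars - 1

def largest_n_for_hand_selection(limit=MAX_HAND_RERUNS):
    """Largest N whose subsets all fit within the rerun limit."""
    n = 0
    while ideal_model_count(n + 1) <= limit:
        n += 1
    return n

largest_n = largest_n_for_hand_selection()
```

With N=9 there are 511 combinations, which fit within 999 reruns, while N=10 would already require 1023.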

Gorman and Toman [1966] have recently suggested a modification
of the "ideal" technique by applying fractional factorial plans to
sample the $2^N - 1$ possible combinations of IV's in order to reduce the
computational effort required for the "ideal" technique.


IV. DEFINITIONS FOR INPUT, COMPUTATIONS, AND PRINTOUT

In  this  chapter  the  definitions  of  technical  terms  which  are 
used  in  the  following  chapters  are  listed  alphabetically.  (Some  of 
these  terms  have  already  been  used  in  the  previous  chapters.)  This 
list  of  definitions  includes  such  familiar  terms  as,  for  example, 
"independent  variable"  and  "data  matrix."  However,  since  such 
terms  are  often  used  in  the  literature  with  varying  shades  of 
meaning,  the  authors  decided  to  include  these  in  the  list  because 
a  clear  definition  was  considered  necessary  for  the  present  purpose. 

In  the  wording  of  each  definition  all  the  terms  which  are 
defined  elsewhere  in  the  list  are  marked  by  a  dashed  underline.  The 
definitions  are  as  follows: 

A  -  The  symbol  used  for  the  matrix  of  the  normal  equations. 

Accepted Run - A run which passes all 5 tests concerning the feasibility
and accuracy of the solution of the normal equations associated
with the regression model for the given run. The five tests are
those on the determinant, Ra, sa, the cvv, and the ivv. For
details see paragraphs B, D, E, F, and H of Section VI.2.a.(2).

Additional Regression Sum of Squares - In the Main Theorem, the regression
sum of squares, $SS_{N-N'}$, due to the addition of a specified subset
of N-N' independent variables to the model containing the N'
independent variables.

ASSR - "Adjusted (for the mean) Sum of Squares due to Regression." For
the algebraic formulation of ASSR see Section VI.3.a. The term
is used, in the report, in two applications:

(1) $ASSR_K$ = ASSR value due to K independent variables,

(2) $ASSR(x_1, x_2, \ldots)$ = ASSR value due to the set $(x_1, x_2, \ldots)$ of
independent variables.
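
As an illustration of how ASSR and the additional regression sum of squares relate, the following Python sketch uses the standard textbook form of the regression sum of squares adjusted for the mean, $b'X'y - (\sum y_i)^2/n$. This form is an assumption here, since the report's own formulation is given in Section VI.3.a; the function names and the data are hypothetical and do not reproduce DA-MRCA's code.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def assr(iv_columns, y):
    """Regression sum of squares, adjusted for the mean, for a constant plus the given IVs."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in iv_columns] for i in range(n)]  # design matrix
    size = len(rows[0])
    a = [[sum(r[p] * r[q] for r in rows) for q in range(size)] for p in range(size)]  # X'X
    g = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(size)]  # X'y
    b = solve(a, g)  # least squares coefficients
    return sum(bp * gp for bp, gp in zip(b, g)) - sum(y) ** 2 / n


# Hypothetical data: N = 2 independent variables, n = 6 data points.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
y = [5.1, 7.9, 11.2, 13.8, 17.1, 20.0]

assr_full = assr([x1, x2], y)            # ASSR(x1, x2), the main run
assr_reduced = assr([x1], y)             # ASSR(x1), a rerun with N' = 1
additional_ss = assr_full - assr_reduced  # additional regression sum of squares due to x2
print(assr_full, assr_reduced, additional_ss)
```

The difference of the two ASSR values is the quantity used to judge the prediction power of the deleted variable.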

BIVOR - "Backward Independent Variable Ordering by Regression sums of
squares." BIVOR is an optional subroutine which ranks the
independent variables in ascending order of importance according
to their contribution to the total regression sum of squares.
See Section III.2 and Section VI.1.e for further explanation.

Calculated  Identity  Matrix  -  See  definition  of  "identity  matrix." 

Card Type - One of the ten types of cards which constitute the problem
deck. Each type of card is punched according to the input
explanation and format given in Section V.2.

$c_{vv'}$ - The element in the (v+1)th row and (v'+1)th column of the inverse,
$A^{-1}$, of the matrix of the normal equations. (v, v' = 0, 1, 2, ..., K.)

Coding - A term sometimes used for the transformation of the coordinates
of the OCIV's to increase the computational accuracy, where the
specific transformation recommended is $v = (x - \bar{x})/R_x$. See Section
VII.2.a for further discussion.
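
A minimal Python sketch of such a coding step is given below. It assumes that $\bar{x}$ is the mean of the observed coordinates and that $R_x$ is their range (maximum minus minimum); neither is stated in this definition, so both are assumptions, and the function name is hypothetical.

```python
def code_variable(xs):
    """Return (x - mean) / range for each coordinate; mean and range are assumed meanings."""
    mean = sum(xs) / len(xs)
    value_range = max(xs) - min(xs)
    if value_range == 0:
        raise ValueError("all coordinates are equal; the range is zero")
    return [(x - mean) / value_range for x in xs]


# Large raw coordinates are mapped to small, centered values.
print(code_variable([1000.0, 1010.0, 1020.0]))
```

Centering and scaling of this kind keeps the sums of squares and cross-products at comparable magnitudes, which is the usual motivation for coding before the normal equations are formed.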

Coordinate - The numerical value of an independent variable or of the
dependent variable specifying, for the corresponding variable,
the location of the design point or the data point. The observed
numerical values of the dependent variable and the OCIV's are
sometimes referred to as "observed coordinates", in contrast
to the computed coordinates of the GCIV's.

Data Matrix - The n x (K+1) matrix consisting of the n data points. With
K=N or K=N'<N, the data matrix is defined for the main run or for any
rerun, respectively. The data matrix is printed only for the
main run, see Section VI.3.a.

Data Point - A point specified by its K+1 coordinates in the (K+1)-
dimensional space which is defined by the K independent variables
and the dependent variable. With K=N or K=N'<N, a data point is
defined for the main run or for any rerun, respectively. The
number of data points (not necessarily all distinct) in a given
regression problem is called n. As can be seen, a data point is
defined by the coordinates of a design point and the coordinate
of the dependent variable. Since several data points can be
based on a common design point, one has $n \ge n_K$, where $n_K$ is the
number of distinct input design points in the space defined by
the K independent variables.

Dependent Variable - The response variable, y (random), for which a
numerical value $y_i$, i = 1, 2, ..., n, is observed at each one of
the n observed (not necessarily all distinct) input design points.

Design Matrix - The n x (K+1) matrix, denoted by X, of the n coordinates
of the K independent variables, augmented by a column vector of
n 1's for the constant, $x_0 = 1$. With K=N or K=N'<N, the design
matrix is defined for the main run or for any rerun, respectively.
Each row of the design matrix represents an input design point,
not necessarily all different.

Design Point - A point specified by its K coordinates in the K-dimensional
space which is defined by the K independent variables. With K=N
or K=N'<N, a design point is defined for the main run or for any
rerun, respectively. The symbol used for a design point is
$(x_1, x_2, \ldots, x_v, \ldots, x_K)$.

Distinct Design Point - A design point specified by a unique combination
of K coordinates. With K=N or K=N'<N, a distinct design point
is defined for the main run or for any rerun, respectively. The
number of distinct input design points in a given run with K
independent variables is called $n_K$. In case of a rerun (K=N'<N)
the number $n_K$ is defined only for an independent variable
selection containing a specific set of K=N' IV's. It should
be noted that $n_K$ is equal for all those IVS's in a given
regression problem which contain the same OCIV's.
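
The dependence of $n_K$ on the independent variable selection can be illustrated with a short Python sketch. The data and the helper names are hypothetical; the sketch only counts distinct coordinate combinations and does not reproduce any DA-MRCA routine.

```python
def count_distinct_design_points(design_points):
    """Number of distinct coordinate combinations among the given design points."""
    return len({tuple(point) for point in design_points})


def select_ivs(design_points, iv_indices):
    """Keep only the coordinates of the IVs contained in a given selection."""
    return [tuple(point[v] for v in iv_indices) for point in design_points]


# Main run with N = 2 independent variables and n = 4 data points.
main_run_points = [(1.0, 2.0), (1.0, 3.0), (2.0, 2.0), (1.0, 2.0)]
n = len(main_run_points)
n_k_main = count_distinct_design_points(main_run_points)

# Rerun containing only the first independent variable (N' = 1).
rerun_points = select_ivs(main_run_points, [0])
n_k_rerun = count_distinct_design_points(rerun_points)

print(n, n_k_main, n_k_rerun)
```

In this example two data points share a design point in the main run, and deleting the second variable merges further design points in the rerun.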

$E_{vv'}$ - The element in the (v+1)th row and the (v'+1)th column of the
matrix of the normal equations. Algebraically,

$$E_{vv'} = \sum_{i=1}^{n} x_{vi} x_{v'i}, \qquad (v, v' = 0, 1, 2, \ldots, K).$$

$E_{vy}$ - The element in the (v+1)th row and the (N+2)th column of the
summation matrix (v = 0, 1, ..., N). Algebraically,

$$E_{vy} = \sum_{i=1}^{n} x_{vi} y_i .$$

$E_{yy}$ - The total sum of squares of y, unadjusted for the mean.
Algebraically,

$$E_{yy} = \sum_{i=1}^{n} y_i^2 .$$

$E_{yy}$ is the lower right hand corner element of the summation
matrix.

GCIV - "Generated Concomitant Independent Variable." A GCIV is an
independent variable which is generated from powers and/or
cross-products of OCIV's. A GCIV may also be called a "product
term."

Generated  Independent  Variable  -  See  GCIV. 

Hand Selected Rerun - The desired regression computations which are
performed for a model containing a specified subset of N'<N
independent variables, where the particular set of N' independent
variables is indicated on a punched card (Card Type 10, see
Section V.2) in the problem deck.

$I_c$ - The symbol used for the calculated identity matrix.

Identity Matrix - The (K+1) x (K+1) matrix, denoted by $I_c$, resulting
from multiplying the inverse of the matrix of the normal
equations by the matrix A itself (in this sequence): $I_c = A^{-1}A$.
With K=N or K=N'<N, the identity matrix is defined for the
main run or for any rerun, respectively. The identity matrix
is computed in each run in order to check the accuracy of $A^{-1}$.
For details see Sections VI.1.b and VI.2.a.(2).
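
The idea of checking an inverse through the calculated identity matrix can be sketched in Python as follows. The Gauss-Jordan inversion and the helper names are hypothetical stand-ins chosen for illustration; the inversion method and the actual acceptance tests of DA-MRCA are those described in Sections VI.1.b and VI.2.a.(2), not the ones shown here.

```python
def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(map(float, row)) + [1.0 if r == c else 0.0 for c in range(n)]
         for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def calculated_identity(a):
    """Form I_c = A^{-1} A, with the inverse on the left as in the definition."""
    inv = invert(a)
    n = len(a)
    return [[sum(inv[r][k] * a[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def max_identity_deviation(a):
    """Largest absolute deviation of I_c from the exact identity matrix."""
    ic = calculated_identity(a)
    n = len(a)
    return max(abs(ic[r][c] - (1.0 if r == c else 0.0))
               for r in range(n) for c in range(n))


# Matrix of the normal equations for x = 1, 2, 3 with a constant term.
a = [[3.0, 6.0], [6.0, 14.0]]
print(max_identity_deviation(a))
```

A deviation close to zero indicates that the computed inverse is accurate; a large deviation signals an ill-conditioned matrix of the normal equations.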




Independent Variable - One of the non-random variables, $x_v$, in the
linear regression model, whose prediction capacity for the
dependent variable, y, is being investigated by a regression
analysis. See also the definitions of OCIV and GCIV. For
further discussion see Sections II.1 and II.2.

Independent Variable Selection - A subset of N' of the N independent
variables originally input for a given regression problem.
In the corresponding rerun, the regression computations are
performed for the model containing these N' independent
variables. Independent variable selections may be done
"by hand" (see Section III.1 and Card Type 10 in Chapter V)
or automatically by IVOR and/or BIVOR. Not every independent
variable selection will necessarily lead to all desired
computations of a rerun.

Input Design Point - A design point specified by its K observed or
measured coordinates in the K-dimensional space defined by the
K independent variables (both OCIV's and GCIV's) for which an
observed or measured value of the dependent variable exists.
With K=N or K=N'<N, an input design point is defined for the
main run or for any rerun, respectively. The number of distinct
input design points for any run (with K independent variables)
is called $n_K$. An input design point, as the name suggests, is
part of the data input for the program. However, the actual
input writing is done, in DA-MRCA, only for the coordinates of
the OCIV's, whereas the coordinates of the GCIV's may
automatically be computed by the program.

$i_{vv'}$ - The element in the (v+1)th row and (v'+1)th column of the
calculated identity matrix. (v, v' = 0, 1, 2, ..., K.)

IV  -  "Independent  Variable"  (see  definition). 

IVOR - "Independent Variable Ordering by Regression sums of squares."
IVOR is an optional subroutine which ranks the independent
variables in descending order of importance according to their
contribution to the total regression sum of squares. See
Section III.2 and Section VI.1.d for further explanation.

IVS - "Independent Variable Selection" (see definition).

K - The number of independent variables in a given run. In the
main run, K=N; in a rerun, K=N'<N, i.e., K equals the number of
the independent variables contained in the specific independent
variable selection of the given rerun.

Leftmost Group - In IVOR and BIVOR, the first group of independent
variables, according to the input and generation sequence, as
designated by Card Type 4 and Card Type 5, respectively.
(See Section V.2.) The leftmost group in IVOR is the first
group of independent variables to be ranked, whereas in BIVOR
the leftmost group is the last group of independent variables
to be ranked.

Leftmost  IV  -  At  a  given  step  of  IVOR  and/or  BIVOR,  the  first  (according 
to  the  input  and  generation  sequence)  unranked  independent 
variable  in  a  given  group  of  independent  variables. 

Main Run - The regression computations which are performed for the
model containing all N independent variables originally input
for a given regression problem.

Main Theorem - The theorem of multiple regression on which all
hypothesis testing and ranking of independent variables are
based in DA-MRCA. See Section III.1 for a full discussion.

Matrix of the Normal Equations - The (K+1) x (K+1) symmetric matrix
denoted by A and formed by pre-multiplying the design matrix, X,
by its transpose, X'. For the full algebraic representation
of A see Section VI.3.a. With K=N or K=N'<N, the matrix of
the normal equations is defined for the main run or for any
rerun, respectively.

n - The number of data points input in one regression problem.
(n ≤ 7000.)

N - The number of independent variables (OCIV's and GCIV's) contained
in the original regression model, i.e., in the model of the
main run. (N ≤ 50.)

N' - The number of independent variables (OCIV's and GCIV's) contained
in the model of a rerun.

$n_K$ - The number of distinct design points in a given run with a
specific set of K independent variables contained in the
regression model.

Non-Obvious Linear Dependency - A linear dependency among two or
more rows (columns) of the matrix of the normal equations when
the dependency is not obvious in the sense of the "obvious
linear dependency" (see definition).

Non-Zero Error Perfect Fit - A perfect fit in the case where the
number of data points, n, is larger than the number, $n_K$, of
distinct design points input: $n > n_K (= K+1)$. This term is
used only when a distinction from a zero error perfect fit
appears to be necessary.

Obvious Linear Dependency - A linear dependency among two or more
rows (columns) of the matrix of the normal equations when the
cause for the dependency can immediately be recognized from
the number and/or constellation of the $n_K$ distinct design
points. See Section VII.2.b for more details.

OCIV - "Original Concomitant Independent Variable." An OCIV is an
independent variable which has physically been observed or
measured for each value of the dependent variable. (The
auxiliary variables used for the main effects in the multiple
regression approach to analysis of variance, see Section II.3,
are also considered as OCIV's with respect to the method of
input into the program.) The term OCIV is used to differentiate
this type of independent variable from a GCIV. The adjective
"concomitant" stems from the concept of analysis of covariance
to which DA-MRCA can also be applied. To distinguish OCIV's
from GCIV's, the OCIV's are sometimes given the symbols $z_j$,
j = 1, ..., IR, where IR is the number of OCIV's.

Original  Independent  Variable  -  See  OCIV. 

Perfect Fit - The least squares fit in the case where the number of
distinct design points input, $n_K$, equals the number of
independent variables in the model, plus 1: $n_K = K+1$. See also
the definition for "zero error perfect fit" and for "non-zero
error perfect fit."
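
The distinction between the two kinds of perfect fit depends only on the three counts n, $n_K$, and K, as the following Python sketch illustrates. The function name and the returned labels are hypothetical; the conditions are taken from the definitions of "zero error perfect fit" ($n = n_K = K+1$) and "non-zero error perfect fit" ($n > n_K = K+1$) in this list.

```python
def classify_fit(n: int, n_k: int, k: int) -> str:
    """Classify a run from n data points, n_k distinct design points, and k IVs."""
    if n < n_k:
        raise ValueError("n cannot be smaller than the number of distinct design points")
    if n_k != k + 1:
        return "not a perfect fit"
    if n == n_k:
        return "zero error perfect fit"       # no degrees of freedom left for error
    return "non-zero error perfect fit"       # replicated design points remain


print(classify_fit(n=3, n_k=3, k=2))
print(classify_fit(n=5, n_k=3, k=2))
print(classify_fit(n=10, n_k=8, k=2))
```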

Powersum - A term sometimes used in the discussion of GCIV's where it
stands for the sum of the exponents of all OCIV's which are
contained in the GCIV. For example, the powersum of the GCIV
x 1X3X3  is  6. 
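
A short Python sketch of the powersum and of the generation of a GCIV coordinate from OCIV coordinates follows. The product term used, $x_1 x_2^2 x_3^3$, is a hypothetical example chosen for illustration and is not taken from the report; the function names are likewise hypothetical.

```python
def powersum(exponents):
    """Sum of the exponents of all OCIV's contained in a GCIV."""
    return sum(exponents.values())


def gciv_coordinate(ociv_coordinates, exponents):
    """Coordinate of a GCIV formed as a product of powers of OCIV coordinates."""
    value = 1.0
    for index, power in exponents.items():
        value *= ociv_coordinates[index] ** power
    return value


# Hypothetical product term x1 * x2**2 * x3**3, keyed by OCIV index.
term = {1: 1, 2: 2, 3: 3}
print(powersum(term))
print(gciv_coordinate({1: 2.0, 2: 3.0, 3: 1.0}, term))
```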

Predicted Value (= Prediction) - The value ($\hat{y}$) of the dependent variable
as computed by evaluating the regression line (least squares
fit) for a given model at an input design point or a synthetic
design point.

Prediction Error - The deviation (e) of the input value (y) of the
dependent variable from the predicted value ($\hat{y}$) of the dependent
variable for any input design point in a given run.

Prediction Power - A term used for a characteristic of an individual
independent variable or a group of IV's with respect to the
dependent variable. The prediction power is measured by the
additional regression sum of squares due to the individual
IV or the group of IV's. See also Chapter III.

Prediction  Standard  Deviation  for  Individual  Observations  -  The 

estimate  (s^pp  of  the  standard  deviation  of  a  prediction  in 
a given run at a specified design point. The prediction
standard deviation may be computed for a selected input design
point or a synthetic design point and is used in the computation
of confidence limits for individual future observations
(tolerance limits) of the dependent variable. (See Section VI.3.)

Prediction  Standard  Deviation  for  the  Prediction  Line  -  The  estimate 
(s.pO  of  the  standard  deviation  for  the  prediction  line 
(regression equation) in a given run at a specified design point.
The prediction standard deviation for the prediction line may
be computed for a selected input design point or a synthetic
design point and is used for the computation of confidence
limits for the prediction line. (See Section VI.3.)

Problem Deck - The deck of punched cards which constitute the program
input for one regression problem. The problem deck consists of
cards of Types 1-10, see Section V.1.

Product Term - A synonym for GCIV.

Program  Deck  -  The  deck  of  punched  cards  containing  the  input-output 
requirements  (see  Section  VIII. 3)  and  the  program  instructions 
which  are  coded  in  FORTRAN  IV  for  the  IBM  7030  Computer.  The 
program  deck  and  the  problem  deck  together  constitute  the  total 
card  input  for  a  regression  problem. 

Program Variable - A program input parameter whose value is to be
specified by the program user for each regression problem.

Ranking of Independent Variables - A process automatically executed
by IVOR or BIVOR, sometimes also referred to as "ordering" of
IV's.

Regression Problem - The totality of all phases of the regression
analysis to be performed on one set of n data points as
specified by one problem deck. A regression problem might
include, therefore, the main run and several reruns, IVOR and
BIVOR, the Chi-square test on normality of residuals in all runs,
and other optional features.

Regression Sum of Squares Adjusted for the Mean = ASSR. See definition
of ASSR.

Rejected Run - A run which is not an accepted run, i.e., a run which
fails one of the 5 tests mentioned in the definition of an
accepted run.

Rerun - The desired regression computations which are performed for
a model containing a specified subset of N'<N independent
variables, i.e., the computations performed for a specified
independent variable selection. A rerun can be specified
automatically or "by hand."

Restricted Admissibility - A term used in connection with the ranking
procedures IVOR and BIVOR. When ranking polynomial terms, or
auxiliary variables in non-orthogonal analysis of variance, it
is sometimes not advisable to consider all unranked IV's at
a given step for ranking at that step. See Sections II.3 and
VII.2.a for more details. Restricted admissibility can be
effected by the grouping of IV's in IVOR and BIVOR, see
Sections VI.1.d and VI.1.e.

Rightmost Group - In IVOR and BIVOR, the last group of independent
variables, according to the input and generation sequence, as
designated by Card Type 4 and Card Type 5, respectively. (See
Section V.2.) The rightmost group in IVOR is the last group
of independent variables to be ranked, whereas in BIVOR the
rightmost group is the first group of independent variables
to be ranked.

Rightmost IV - At a given step of IVOR and/or BIVOR, the last
(according to the input and generation sequence) unranked
independent variable in a given group of independent variables.

Run - The totality of all desired phases of the regression analysis
to be performed on a model including a specified set of K
independent variables. With K=N or K=N'<N, the main run or any
rerun is included in this definition.

Selected Input Design Point - An input design point selected by the
program user, for which the prediction and the prediction
standard deviation for the prediction line or for individual
observations are to be computed.

Significant Model - A regression model containing all independent
variables which contribute significantly to the total regression
sum of squares due to the N independent variables in a regression
problem, as determined by reruns and the associated F ratios for
regression on deleted independent variables.

$SS_{N-N'}$ - See definition of "additional regression sum of squares."

Step (of IVOR or BIVOR) - All calculations which lead to the
determination of an independent variable to be included in or
to be deleted from the regression model in IVOR or BIVOR,
respectively.

Summation Matrix - The (N+2) x (N+2) symmetric matrix composed of the
(N+1) x (N+1) matrix (A) of the normal equations of the main
run, the constants, $E_{vy}$, of the normal equations (v = 0, 1, ..., N),
and the sum of squares, $E_{yy}$, of the observations of the dependent
variable. For the algebraic representation of the summation
matrix see Section VI.3.a. The summation matrix is defined and
printed only for the main run.
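
The composition of the summation matrix can be sketched in Python by augmenting each data point with the constant 1 in front and the observation y behind, and then forming all sums of squares and cross-products. The function name and the data are hypothetical, and the layout (constant first, y last) is inferred from the definitions in this list rather than from Section VI.3.a.

```python
def summation_matrix(iv_rows, y):
    """(N+2) x (N+2) matrix of sums of squares and cross-products of [1, x_1..x_N, y]."""
    z = [[1.0] + list(row) + [yi] for row, yi in zip(iv_rows, y)]
    size = len(z[0])
    return [[sum(r[p] * r[q] for r in z) for q in range(size)] for p in range(size)]


# Hypothetical main run with N = 1 independent variable and n = 3 data points.
s = summation_matrix([(1.0,), (2.0,), (3.0,)], [2.0, 4.0, 7.0])
for row in s:
    print(row)
```

The upper left (N+1) x (N+1) block is the matrix A of the normal equations, the last column holds the constants of the normal equations, and the lower right hand corner element is the unadjusted total sum of squares of y.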

Synthetic Design Point - A point in the K-dimensional space defined by
the K independent variables of a given run at which no value
of the dependent variable has been observed. With K=N or
K=N'<N, a synthetic design point is defined for the main run
or for any rerun, respectively. The K coordinates of a
synthetic design point are specified by the analyst. The
concept is employed in an optional subroutine which computes
predictions and prediction standard deviations for the prediction
line or for individual observations at specified synthetic design
points.

Total Regression Sum of Squares - A term sometimes used for the ASSR
value of the main run, i.e., $ASSR_N$. (The main run contains the
"totality" of all N independent variables originally considered
in the regression problem, hence this name for $ASSR_N$.)

$x_{vi}$ - The symbol used for the numerical value (coordinate) of
independent variable $x_v$ for the ith data point. (i = 1, ..., n;
$x_{0i} = 1$; v = 1, 2, ..., N in the main run.)

$y_i$ - The symbol used for the numerical value (coordinate) of the
dependent variable for the ith data point. (i = 1, ..., n.)

Zero Error Perfect Fit - A perfect fit in the case where the number
of data points, n, equals the number, $n_K$, of distinct design
points input: $n = n_K (= K+1)$. The zero error perfect fit leaves
no degrees of freedom for the error variance, hence the name.
For further discussion see Section II.4.

V. INPUT PREPARATION

In  this  chapter  the  preparation  of  input  for  the  DA-MRCA 
program  is  described.  The  various  sections  of  the  chapter  give  the 
problem  deck  setup  (Section  V.l),  the  preparation  of  the  problem  deck 
(Section  V.2),  and  an  example  problem  deck  (Section  V.3). 


V.1 Problem Deck Setup


The  problem  deck  for  the  general  case  is  listed  below  by  card 
type.  There  are  ten  card  types  required  for  the  general  case,  and  they 
are  designated  in  order  of  input  and  by  card  name.  For  specific  cases 
more  than  one  punched  card  of  a  particular  card  type  may  be  necessary. 
The  names  of  these  card  types  are  followed  with  an  "(S)"  to  denote 
the  plural  possibility.  The  explanation  of  each  card  type  and  the 
instructions for the preparation of the problem deck are given in the
next  section. 


CARD  TYPE  1  -  PROBLEM  IDENTIFICATION  CARD 


CARD  TYPE  2  -  PROBLEM  CONTROL  CARD 

CARD  TYPE  3  -  PRODUCT  TERM  DESCRIPTION  CARD(S)  (Optional)* 

CARD  TYPE  4  -  IVOR  CONTROL  CARD  (Optional)* 

CARD  TYPE  5  -  BIVOR  CONTROL  CARD  (Optional)* 

CARD  TYPE  6  -  SELECTED  INPUT  DESIGN  POINT  CARD(S)  (Optional)* 

CARD  TYPE  7  -  SYNTHETIC  DESIGN  POINT  CARD(S)  (Optional)* 


CARD  TYPE  8  -  DATA  INPUT  CARDS 


CARD  TYPE  9  -  DATA  TERMINATION  CARD 

CARD  TYPE  10  -  RERUN  CARD(S)  (Optional)* 


NOTE:  The  cards  whose  names  are  marked  with  asterisks  (*)  control 
optional  features  of  the  program  and  are  omitted  when  the  corresponding 
options  are  not  desired. 

The  problem  deck,  as  listed  above,  is  stacked  behind  the 
program  deck  and  constitutes  the  input  for  one  regression  problem.  The 
information  contained  on  the  DATA  INPUT  and  DATA  TERMINATION  CARDS 
(Card Types 8 and 9) may be placed on magnetic tape and the remainder
of the problem deck prepared on cards. Problem decks for additional
regression  problems  are  stacked  consecutively  behind  the  program  deck. 

Each  problem  deck  may  contain  a  different  combination  of  the  optional 
cards.  If  a  multiple  problem  case  utilizes  tape  data  of  the  types 
previously  specified,  the  tape  data  must  be  ordered  in  the  same  manner 
as it would be presented as parts of the problem decks. Also, for the
case  of  tape  input,  the  tape  identification  number  must  be  punched  on 
the  REEL  CARD  (third  card  of  the  program  deck)  starting  in  column  18. 

No  identification  number  is  necessary  for  card  input. 

V.2  Preparation  of  Problem  Deck 

In  this  section,  instructions  for  the  preparation  of  the  problem 
deck  are  given.  These  instructions  consist  of:  (a)  the  columns  in 
which  the  punched  entries  are  to  be  made;  (b)  the  input  formats;  (c) 
the  symbolic  names  of  the  program  variables  (when  applicable);  and  (d) 
explanations  of  the  punched  entries  associated  with  each  program  variable. 

To  facilitate  the  reading  of  the  input  instructions  for  the 
program  user,  who  may  be  unfamiliar  with  the  FORTRAN  language,  an 
explanation  of  the  various  format  specifications  used  to  describe  the 
input-output data of DA-MRCA follows. Each format specification contains
a letter indicating the type of information which must be input;
also,  the  format  specification  contains  integers  which  control  the 
number  of  input  fields  to  be  used,  the  number  of  columns  in  each  field, 
and  the  regulation  of  the  assumed  decimal  point  if  the  decimal  point 
is  not  entered  on  the  input  card. 

Format  Specification  A  -  This  specification  is  of  the  form  Aw, 
where  A  indicates  that  the  input  can  be  alphanumeric  (alphabetical  or 
numerical)  and  the  w  indicates  the  number  of  columns  in  the  field.  By 
writing a repetition number in front of the A, the same format
specification can be applied to several successive fields, e.g., 10A8 means
ten eight-column fields of alphanumeric information.
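
The effect of a repeated A specification such as 10A8 can be illustrated with a few lines of Python that cut an 80-column card image into ten eight-column fields. The function name and the card text are hypothetical and serve only to show the field layout.

```python
def read_a_fields(card: str, count: int, width: int):
    """Split a card image into `count` alphanumeric fields of `width` columns each."""
    card = card.ljust(count * width)  # a short card is padded with blanks
    return [card[i * width:(i + 1) * width] for i in range(count)]


fields = read_a_fields("REGRESSION PROBLEM 1", 10, 8)
print(fields[:3])
```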

Format  Specification  I  -  This  specification  is  of  the  form  Iw, 
where the I indicates that the input must be an integer and the w
indicates  the  number  of  columns  in  the  field.  Decimal  points  are  not 
permitted  and  all  input  entries  must  be  right  adjusted,  i.e.,  all 
entries  are  punched  in  the  column  or  columns  furthermost  to  the  right 
within  the  field. 

Format Specification X - This specification is of the form wX,
which  means  that  a  field  of  w  columns  is  to  be  left  blank. 

Format Specification E (Exponential) - This specification is of
the form Ew.d, where the E indicates that the input value describes a
real number in scientific notation, for example, a number of the
form $2.30 \times 10^4$. (The actual FORTRAN representation is 2.30E+04.) The
w indicates the number of columns in the field. The d indicates the
number of digits to the right of the assumed decimal point if an actual
decimal point is not punched. A repetition number written in front of
the E applies the same format specification to a corresponding number
of successive fields. In DA-MRCA the E format is used for the input
of the two program variables TOLU and TOLI2 (Card Type 2 of the problem
deck, see below) and, if specified, for the input of the coordinates of
the OCIV's, the dependent variable, and the coordinates of the synthetic
design points. The exponential part of the input number is generally
of the form E±ee; however, other forms, such as E±e, ±ee and ±e, are
permissible. Positive exponents can also be expressed as Ee or Eee.
Example: The input values +5879E+03, .5879E+3, +58.79+01 and 5879.-1
would all read as 587.9 if the input format specification E9.4 is used.
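
The reading rules for the E format can be mimicked by the following Python sketch, which is offered only as an illustration of the rules stated above and not as a reproduction of the FORTRAN input routines. The function name is hypothetical, and the sketch assumes a field without embedded blanks.

```python
import re
from decimal import Decimal

# Mantissa, then an optional exponent written either with E or as a bare signed integer.
_E_FIELD = re.compile(r"([+-]?(?:\d+\.?\d*|\.\d+))(?:[Ee]([+-]?\d+)|([+-]\d+))?")


def read_e_field(field: str, d: int) -> float:
    """Interpret one input field according to the rules described for Ew.d."""
    text = field.strip()
    match = _E_FIELD.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid E-format field: {field!r}")
    mantissa_text, exp_with_e, exp_bare = match.groups()
    mantissa = Decimal(mantissa_text)
    if "." not in mantissa_text:
        # No punched decimal point: assume d digits to the right of the point.
        mantissa = mantissa.scaleb(-d)
    exponent = int(exp_with_e or exp_bare or 0)
    return float(mantissa.scaleb(exponent))


for card_field in ["+5879E+03", ".5879E+3", "+58.79+01", "5879.-1"]:
    print(card_field, read_e_field(card_field, d=4))
```

All four fields of the example above are interpreted as 587.9; only the first one relies on the assumed decimal point given by d=4.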

Format Specification F - This specification is of the form Fw.d,
where the F indicates that the input value describes a real number
without an exponent notation; the w indicates the number of columns
in the field and the d specifies the number of digits in the fractional
portion of the number. (The d-specification is overridden by a punched
decimal point.) A repetition number written in front of the F applies
the same format specification to a corresponding number of successive
fields. In DA-MRCA the F format is used, if specified, for the input
of the coordinates of the OCIV's, the dependent variable, and the
coordinates of the synthetic design points. Example: The input
value of 16897 would be read as 1689.7 if the input format
specification of F5.1 is used.
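
The F format rule, including the override by a punched decimal point, can be illustrated by the following Python sketch. The function name is hypothetical, and the sketch assumes a field without embedded blanks; it is not the FORTRAN input routine itself.

```python
from decimal import Decimal


def read_f_field(field: str, d: int) -> float:
    """Interpret one input field according to the rules described for Fw.d."""
    text = field.strip()
    value = Decimal(text)
    if "." not in text:
        # No punched decimal point: the last d digits form the fractional part.
        value = value.scaleb(-d)
    return float(value)


print(read_f_field("16897", d=1))   # assumed decimal point
print(read_f_field("168.9", d=1))   # punched decimal point overrides d
```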

The  instructions  for  the  input  preparation  follow  below. 


CARD TYPE 1 - PROBLEM IDENTIFICATION CARD

Column   Format   Program Variable   Explanation

1-80     10A8     PGLB               Regression Problem Identification Card.
                                     (Any columns may be used.)


CARD TYPE 2 - PROBLEM CONTROL CARD

Column   Format   Program Variable   Explanation

1-2      I2       IR                 Enter the number of original concomitant
                                     independent variables (OCIV's) whose
                                     coordinates will be input on DATA INPUT
                                     CARDS (Card Type 8).


3-4      I2       IS                 Enter the number of generated concomitant
                                     independent variables (GCIV's) to be
                                     computed from the IR OCIV's (see Card
                                     Type 3). IR + IS = N ≤ 50.

5-7      I3       NR                 Enter the number of hand selected reruns
                                     (see Card Type 10). Punch a 0 if only
                                     automatic reruns are desired as selected
                                     by IVOR and/or BIVOR. 0 ≤ NR ≤ 999.

8-10     I3       MVP                Enter the number of synthetic design
                                     points to be read from Card Type 7 -
                                     SYNTHETIC DESIGN POINT CARD(S) - for
                                     which the computations indicated in
                                     column 14 of the present card will be
                                     performed. 0 ≤ MVP ≤ 999.

11-13    I3       NDR                Enter the number of selected input
                                     design points for which the computations
                                     indicated in column 14 will be performed.
                                     The selected input design points are
                                     denoted on Card Type 6 - SELECTED INPUT
                                     DESIGN POINT CARD(S). 0 ≤ NDR ≤ 999.


14       I1       MVPL               0 = Predictions and prediction standard
                                     deviations for individual observations
                                     will be computed for selected input
                                     design points and/or synthetic design
                                     points for the main run and each hand
                                     selected rerun. (The standard deviations
                                     can be used to construct tolerance limits
                                     for individual observations, see Section
                                     VI.3.b.(2).)

                                     1 = Predictions and prediction standard
                                     deviations for the prediction line will
                                     be computed for selected input design
                                     points and/or synthetic design points
                                     for the main run and each hand selected
                                     rerun. (The standard deviations can be
                                     used to construct confidence limits for
                                     the prediction line, see Section VI.3.b.(2).)

15       I1       NPE                0 = Predictions and prediction errors
                                     will not be printed and the test for
                                     normality of the prediction errors will
                                     not be performed for hand selected reruns
                                     and IVOR and/or BIVOR reruns.

                                     1 = Predictions and prediction errors
                                     will be printed and the test for normality
                                     of the prediction errors will be performed
                                     for hand selected reruns and IVOR and/or
                                     BIVOR reruns.

16      I1      NDPO              0 = The coordinates of the data points
                                  will be printed (in the data matrix)
                                  in the format 9F13.6 and the predictions
                                  and the prediction errors will be printed
                                  in the format 2F15.6.

                                  1 = The coordinates of the data points
                                  will be printed (in the data matrix)
                                  in the format 7E17.8 and the predictions
                                  and the prediction errors will be printed
                                  in the format 2E15.6.

                                  2 = The coordinates of the data points
                                  will not be printed but the predictions
                                  and the prediction errors will be printed
                                  in the format 2F15.6.

17      I1      TAPE              0 = The coordinates of the OCIV's and
                                  the dependent variable and also the data
                                  termination indicator will be input on
                                  cards.

                                  1 = The above will be input on magnetic
                                  tape. (The tape identification number
                                  must be entered on the REEL CARD of the
                                  program deck starting in column 18.)

18      I1      IVORGO            0 = IVOR and BIVOR will not be used.

                                  1 = IVOR will be used.

                                  2 = BIVOR will be used.

                                  3 = IVOR and BIVOR will be used.

19-20   I2      NFD               Enter the number of data fields to be
                                  read from each DATA INPUT CARD (input
                                  record, if tape is used) as indicated
                                  by the input reading format (see columns
                                  41-80). If no entry is given or if a
                                  zero is entered, seven data fields will
                                  be assumed.

21      I1      IBID              0 = In BIVOR, the identity matrix will
                                  be computed for all reruns and accuracy
                                  checks will be performed on all identity
                                  matrices (see columns 23-40).

                                  1 = In BIVOR, the identity computations
                                  and accuracy checks will be terminated
                                  with the first rerun in which an identity
                                  matrix has been computed which satisfies
                                  the accuracy criteria imposed by the
                                  value of I(1) (see columns 23-31).

                                  This option is a time-saving device
                                  which may be advantageously applied in
                                  cases with a large number of independent
                                  variables. See also Section VI.2.d,
                                  paragraph C.

22      --      --                Leave blank.

23-31   E9.5    TOLI1             Enter the value of I(1). This value
                                  will be used as the accuracy criterion
                                  for controlling the printout of the
                                  identity matrix for the main run and
                                  each rerun. If $|i_{vv'} - L| \ge I(1)$, where
                                  L = 1 when v = v' and L = 0 when v ≠ v', the
                                  identity matrix will be printed. For
                                  further discussion and for the choice
                                  of I(1) see Section VI.1.b. Notice that,
                                  according to the format specification,
                                  this entry does not have to be right
                                  adjusted. The same applies to the next
                                  two entries (TOLI2 and FORM).

32-40   E9.5    TOLI2             Enter the value of I(2), where I(2) ≥ I(1).
                                  I(2) will be used as the accuracy criterion
                                  which determines acceptance or rejection
                                  of the regression computations for the
                                  main run or any rerun. If $|i_{vv} - 1| \ge I(2)$,
                                  the run will be rejected. (NOTE: I(2)
                                  applies only to the elements of the main
                                  diagonal of the identity matrix.) For
                                  further discussion and for the choice
                                  of I(2) see Section VI.1.b.

41-80   5A8     FORM              Enter the format specifications by
                                  which each Card Type 8 - DATA INPUT
                                  CARD (data input record, if tape input
                                  is used) is to be read. These format
                                  specifications do not include the first
                                  two columns of each DATA INPUT CARD
                                  which must be left blank. All coordinates
                                  of a data point may be read in the same
                                  manner by using a simple format
                                  specification such as 7F10.4 (see Card Type
                                  8). However, if necessary or convenient,
                                  more complex format specifications may
                                  be entered whereby the various
                                  coordinates of a data point may occupy
                                  a varying number of columns. For example,
                                  if a record format of F12.5, 5F10.0,
                                  F8.4 were entered, the dependent variable,
                                  the first five OCIV's, and the sixth OCIV
                                  would constitute the input record and
                                  would be read by these formats, respectively.
                                  (NOTE: The commas must be entered to
                                  separate the individual formats.) If,
                                  in this example, more than six OCIV's
                                  were required to represent a data point,
                                  the additional OCIV's would constitute
                                  another input record and would be read
                                  by the same format specifications; that
                                  is, the seventh OCIV would be read
                                  by F12.5, the eighth, ninth, tenth,
                                  eleventh, and twelfth OCIV's would be
                                  read by 5F10.0 and the thirteenth OCIV
                                  by F8.4, etc.

                                  If NFD = 0 (columns 19-20) the format
                                  7F10.4 is assumed and no entry is
                                  necessary in columns 41-80.
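
The accuracy criteria I(1) and I(2) entered in columns 23-40 of this card drive two separate decisions: whether the computed identity matrix is printed, and whether a run is rejected. The following Python fragment is only a sketch of that decision logic, assuming both comparisons are "greater than or equal"; the function and argument names are hypothetical and are not taken from the program listing.

```python
def accuracy_decisions(identity, tol_print, tol_reject):
    """Return (print_identity, reject_run) for a computed identity matrix.

    identity   -- square list of lists, the computed identity matrix
    tol_print  -- I(1), criterion applied to every element
    tol_reject -- I(2), criterion applied to main-diagonal elements only
    """
    if tol_reject < tol_print:
        raise ValueError("I(2) must not be smaller than I(1)")
    n = len(identity)
    print_identity = False
    reject_run = False
    for v in range(n):
        for w in range(n):
            target = 1.0 if v == w else 0.0   # L = 1 on the diagonal, else 0
            deviation = abs(identity[v][w] - target)
            if deviation >= tol_print:
                print_identity = True
            if v == w and deviation >= tol_reject:
                reject_run = True
    return print_identity, reject_run


# Criteria of the example problem: I(1) = .0001, I(2) = .015
good = [[1.00001, 0.00002], [0.00001, 0.99999]]
fair = [[1.002, 0.0005], [0.0001, 0.999]]
bad = [[1.02, 0.0], [0.0, 1.0]]
```

With these inputs, `good` is neither printed nor rejected, `fair` is printed but accepted, and `bad` is printed and rejected.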


CARD TYPE 3 - PRODUCT TERM DESCRIPTION CARD(S) (Optional)

This card is used to input the description of the IS product
terms (GCIV's) which are to be generated from the values of the IR
original concomitant independent variables (OCIV's). (See columns 1-4
of Card Type 2.) The GCIV's are powers and/or cross-products of the
OCIV's and are generated as additional independent variables. A
product term description designates the independent variables (OCIV's
or GCIV's) which are to be used as multiplicative factors in the
generation of a GCIV. Any OCIV may be used as a factor in the
generation of any GCIV and any previously generated GCIV may be used
as a factor in the generation of a subsequent GCIV. A product term
description consists of the subscripts of the independent variables
which are to be used as factors in generating the GCIV. The following
example case (IR = 2, IS = 7, N = 9) illustrates the procedure for
writing product term descriptions. (This is the case of the example
problem discussed in Sections V.3 and VI.5.)


IV    OCIV   GCIV       Product Term Description

x1    z1                Not applicable
x2    z2                Not applicable
x3           z1 z2      1 2
x4           z1^2       1 1
x5           z2^2       2 2
x6           z1^2 z2    1 1 2  or  1 3  or  2 4
x7           z1 z2^2    1 2 2  or  1 5  or  2 3
x8           z1^3       1 1 1  or  1 4
x9           z2^3       2 2 2  or  2 5

As many as ten factors may be designated for each product
term description and four product term descriptions may be punched
on each card of this Card Type. If no product terms are to be
generated (IS = 0), this card must be omitted from the input deck.
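
The way product term descriptions build up the GCIV's can be sketched in a few lines of Python. This is an illustration only, not the program's own routine: the function name `generate_gcivs` and the list representation of a description are assumptions, and subscripts are taken as 1-based as in the table above.

```python
def generate_gcivs(ocivs, descriptions):
    """Append one GCIV per product term description and return all IVs.

    ocivs        -- values of the IR OCIV's for one data point
    descriptions -- list of subscript lists (1-based); a subscript may
                    refer to an OCIV or to a GCIV generated earlier
    """
    values = list(ocivs)
    for factors in descriptions:
        # At most ten factors per description; requiring at least one
        # factor is an assumption of this sketch.
        if not 1 <= len(factors) <= 10:
            raise ValueError("a product term takes 1 to 10 factors")
        product = 1.0
        for subscript in factors:
            if not 1 <= subscript <= len(values):
                raise ValueError("factor refers to a variable not yet defined")
            product *= values[subscript - 1]
        values.append(product)
    return values


# Example case IR = 2, IS = 7 evaluated at z1 = 2, z2 = 3, once with the
# short descriptions (reusing earlier GCIV's) and once fully written out.
short = [[1, 2], [1, 1], [2, 2], [1, 3], [1, 5], [1, 4], [2, 5]]
full = [[1, 2], [1, 1], [2, 2], [1, 1, 2], [1, 2, 2], [1, 1, 1], [2, 2, 2]]
x_short = generate_gcivs([2.0, 3.0], short)
x_full = generate_gcivs([2.0, 3.0], full)
```

Both description sets yield the same nine independent variables, which is why the table lists them as alternatives.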

Column  Format  Program Variable  Explanation

                                  The description of the first product
                                  term occupying up to 20 columns is
                                  entered in columns 1-20 using two
                                  column fields to designate the factors:

1-2     I2      IN(1,1)           Enter the subscript of the independent
                                  variable to be used as the first factor
                                  in the product term.

3-4     I2      IN(1,2)           Enter the subscript of the independent
                                  variable to be used as the second factor
                                  in the product term.

  .       .        .

19-20   I2      IN(1,10)          Enter the subscript of the independent
                                  variable to be used as the tenth factor
                                  in the product term. (The description
                                  of the product term z1 z2 would be a 1
                                  in column 2 and a 2 in column 4.)

                                  The descriptions of the second, third,
                                  and fourth product terms occupying up
                                  to 20 columns each are entered in
                                  columns 21-40, 41-60 and 61-80,
                                  respectively, in the same manner as the
                                  first product term description.

If more than four product terms are desired (IS > 4), cards in
the same format are added as needed.


CARD  TYPE  4  -  IVOR  CONTROL  CARD  (Optional) 

The information which is input on this card determines the
conditions under which IVOR will consider the independent variables
for ranking. The independent variables can be divided into groups
of consecutive independent variables, according to the sequence of
input and generation, whereupon IVOR ranks the variables within these
groups starting with the first group (see IVOR explanation in Section
VI.1.d). The input parameters of IVOR are the number of variables to
be ordered, the number of groups into which the variables are to be
divided and the number of variables in each group. If IVORGO = 1 or
3 (see column 18, Card Type 2), this card must be included in the
input deck. If IVORGO = 0 or 2, this card must be omitted from the
input deck.

Column  Format  Program Variable  Explanation

1-2     I2      IQ                Enter the number of independent
                                  variables to be ordered by IVOR. If
                                  all N independent variables are to be
                                  ordered, enter 0 or leave blank.
                                  Otherwise

                                  $$IQ \le \sum_{j=1}^{M_I} N_j$$

                                  where $M_I$ is the number of groups and
                                  $N_j$ is the number of independent
                                  variables in the jth group.

3-5     I3      MI                Enter the number ($M_I$) of groups into
                                  which the set of independent variables
                                  is to be divided for ordering within
                                  groups. $1 \le M_I \le 25$.

6-8     I3      NJ(1)             Enter the number ($N_1$) of independent
                                  variables in the first group.

9-11    I3      NJ(2)             Enter the number ($N_2$) of independent
                                  variables in the second group.

  .       .        .

78-80   I3      NJ(25)            Enter the number ($N_{25}$) of independent
                                  variables in the twenty-fifth group
                                  (if $M_I = 25$).


In order to consider all independent variables as one group,
put MI = $M_I$ = 1 and NJ(1) = $N_1$ = IR + IS = N. If only a subset of
the N independent variables is to be considered, specify this by

$$\sum_{j=1}^{M_I} N_j < N;$$

however, the independent variables excluded will be the rightmost
independent variables according to the input and generation sequence.
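
The constraints on the IVOR control card can be checked, and the groups of consecutive variables written out, with a short routine such as the one below. It is a sketch only: the function name is hypothetical, and the treatment of IQ = 0 when the groups cover fewer than N variables (all grouped variables are ordered) is an assumption, since the text defines IQ = 0 only for the case in which all N variables are ordered.

```python
def ivor_groups(n_total, iq, group_sizes):
    """Return (number of variables to order, list of 1-based index groups).

    n_total     -- N = IR + IS
    iq          -- IQ from columns 1-2 (0 or blank means "all")
    group_sizes -- [NJ(1), NJ(2), ...], one entry per group (MI entries)
    """
    if not 1 <= len(group_sizes) <= 25:
        raise ValueError("MI must lie between 1 and 25")
    covered = sum(group_sizes)
    if covered > n_total:
        raise ValueError("group sizes exceed N = IR + IS")
    if iq == 0:
        iq = covered          # assumption: order every grouped variable
    if iq > covered:
        raise ValueError("IQ must not exceed the sum of the NJ(j)")
    groups, start = [], 1
    for size in group_sizes:
        # Groups consist of consecutive variables in input/generation order.
        groups.append(list(range(start, start + size)))
        start += size
    return iq, groups


# Example problem: IQ = 4, MI = 2, NJ(1) = 2, NJ(2) = 7, N = 9
iq, groups = ivor_groups(9, 4, [2, 7])
# A subset: the rightmost variables x8 and x9 are left out of all groups
iq_sub, groups_sub = ivor_groups(9, 0, [2, 5])
```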
CARD TYPE 5 - BIVOR CONTROL CARD (Optional)

The information which is input on this card indicates the
conditions under which BIVOR will consider the independent variables
for ranking. As for IVOR, the independent variables can be divided
into groups of consecutive independent variables, according to the
sequence of input and generation. (The number of independent
variables in the respective groups of IVOR and BIVOR may be entirely
different.) BIVOR will do the ordering within each group starting with
the last group (see BIVOR explanation in Section VI.1.e). If IVORGO =
2 or 3 (see column 18, Card Type 2), this card must be included in
the input deck. If IVORGO = 0 or 1, this card must be omitted from
the input deck.


Column  Format  Program Variable  Explanation

1-2     I2      MB                Enter the number ($M_B$) of groups into
                                  which the independent variables are
                                  to be divided for ordering within
                                  groups. $1 \le M_B \le 25$.

3-5     I3      LOT(1)            Enter the number ($N_1$) of independent
                                  variables in the first group, which
                                  will be the last group of IV's ordered.
                                  ($N_q$ is the number of independent
                                  variables in the qth group.)

6-8     I3      LOT(2)            Enter the number ($N_2$) of independent
                                  variables in the second group, which
                                  will be the next to last group of IV's
                                  ordered.

  .       .        .

75-77   I3      LOT(25)           Enter the number ($N_{25}$) of independent
                                  variables in the twenty-fifth group
                                  (if $M_B = 25$), which will be the first
                                  group of IV's ordered.

In order to consider all independent variables as one group,
put MB = $M_B$ = 1 and LOT(1) = $N_1$ = IR + IS = N. If only a subset of
the N independent variables is to be considered, specify this by

$$\sum_{q=1}^{M_B} N_q < N;$$

however, the independent variables excluded will be the rightmost
independent variables according to the input and generation sequence.

NOTE:  The  program  variable  "LOT"  is  also  used  in  connection 
with  Card  Type  10  -  RERUN  CARD  -  where  it  represents  a  different  input 
parameter.  The  reader  who  is  interested  in  more  details  about  the 
variable  LOT  is  referred  to  Chapter  VIII. 

CARD  TYPE  6  -  SELECTED  INPUT  DESIGN  POINT  CARD(S)  (Optional) 

The input design points for which the predictions and prediction
standard deviations will be computed (see column 14, Card Type 2) are
indicated on this card; these design points are denoted as selected
input design points. Entries made on this card refer to the design
points according to their order of input; i.e., if the computations
are desired for the design point that was input first, a 1 is entered
on this card; if the computations are desired for the design point
that was input third, a 3 is entered on this card, etc. The computations
are performed for the main run and all hand selected reruns. There
must be exactly NDR entries (see columns 11-13, Card Type 2) on this
card and they must be in numerically ascending order. If NDR = 0,
this card must be omitted from the input deck. NDR ≤ 999.


Column  Format  Program Variable  Explanation

1-4     I4      IKEEPR(1)         Enter the number corresponding to the
                                  input order of the first selected
                                  input design point.

5-8     I4      IKEEPR(2)         Enter the number corresponding to the
                                  input order of the second selected
                                  input design point.

  .       .        .

77-80   I4      IKEEPR(20)        Enter the number corresponding to the
                                  input order of the twentieth selected
                                  input design point.

IKEEPR(i) < IKEEPR(i+1) for i = 1, 2, ..., (NDR-1). Additional
cards are used if NDR > 20 and are continued in the same format.
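
The layout of Card Type 6 (twenty I4 fields per card, exactly NDR entries, strictly ascending) can be illustrated by the following Python sketch. The function name is hypothetical, and the sketch assumes the entries are punched right-adjusted in their four-column fields.

```python
def read_selected_points(cards, ndr):
    """Decode NDR selected input design points from 80-column cards.

    Each card carries up to twenty I4 fields (columns 1-4, 5-8, ..., 77-80).
    """
    numbers = []
    for card in cards:
        card = card.ljust(80)
        for k in range(20):
            if len(numbers) == ndr:
                break
            field = card[4 * k:4 * k + 4]
            # A blank field is read as zero, as a FORTRAN I4 field would be.
            numbers.append(int(field) if field.strip() else 0)
    if len(numbers) != ndr:
        raise ValueError("expected exactly NDR entries")
    for earlier, later in zip(numbers, numbers[1:]):
        if earlier >= later:
            raise ValueError("entries must be in ascending order")
    return numbers


# Example problem: NDR = 2, IKEEPR(1) = 4, IKEEPR(2) = 13
points = read_selected_points(["   4  13"], 2)
```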
CARD TYPE 7 - SYNTHETIC DESIGN POINT CARD(S) (Optional)

The synthetic design points for which the predictions and the
prediction standard deviations (see column 14, Card Type 2) will be
computed are specified on this card. A synthetic design point is
specified by coordinates of the IR OCIV's and the IS GCIV's at which
no actual experimentation was performed or no observation was made.
(The coordinates of the GCIV's are not input on this card because
they are generated from the coordinates of the OCIV's by the instructions
given on Card Type 3.) By employing the feature of synthetic design
points it is possible to obtain predictions and prediction standard
deviations for arbitrarily chosen values of the independent variables.
For example, the feature can advantageously be used for interpolation.
The computations are performed for the main run and all hand selected
reruns. The number of synthetic design points input must equal MVP
(see columns 8-10, Card Type 2). The synthetic coordinates of the IR
OCIV's are input with the same format that is used for the DATA INPUT
CARDS, which is the format entered in columns 41-80 of Card Type 2,
ignoring columns 1 and 2; however, the first field of the format
(starting with column 3 of the first card of Card Type 7) is left
blank since it corresponds to the first field of the DATA INPUT CARDS
which is reserved for observations of the dependent variable. Anything
punched in this field will be ignored by the program.

An explanation of the preparation of this control card is
given below for the assumed format of 7F10.4. If MVP = 0, this card
must be omitted from the input deck. MVP ≤ 999.


Column  Format  Explanation

1-2     2X      Leave blank.

3-12    10X     Leave blank.

13-22   F10.4   Enter "synthetic" z11, the value of the first OCIV
                for the first synthetic design point.

23-32   F10.4   Enter "synthetic" z21, the value of the second OCIV
                for the first synthetic design point.

  .       .

63-72   F10.4   Enter "synthetic" z61, the value of the sixth OCIV
                for the first synthetic design point.

Under the assumed format, 7F10.4, which is used here as an
example, and if 6 < IR ≤ 13, a second card would be needed to complete
the representation of the first synthetic design point. This second
card would be read with the same format (7F10.4) with the exception
that columns 3-12 are used for the synthetic value of the seventh
OCIV (syn z71). If IR > 13, additional cards would be necessary in
order to completely represent the first synthetic design point, and
the same format would be applied. Succeeding synthetic design points
are input on successive cards in a similar manner.
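
The number of cards needed for one design point follows from simple arithmetic: the first field of the first card belongs to the dependent variable (blank on synthetic design point cards), so IR + 1 fields are spread over cards of NFD fields each. A minimal Python sketch, with a hypothetical function name:

```python
def cards_per_point(ir, nfd=7):
    """Number of cards (records) needed to represent one design point.

    ir  -- number of OCIV's
    nfd -- data fields per card; 0 or blank means seven fields are assumed
    """
    if nfd == 0:
        nfd = 7
    # Ceiling of (IR + 1) / NFD using integer arithmetic only.
    return (ir + nfd) // nfd


one_card = cards_per_point(6)        # y (or blank) plus z1..z6 on one card
two_cards = cards_per_point(13)      # z7..z13 fill the second card
three_cards = cards_per_point(14)    # z14 starts a third card
example = cards_per_point(2, 3)      # example problem: IR = 2, NFD = 3
```

For the assumed format 7F10.4 this reproduces the cases stated above: one card for IR ≤ 6, two cards for 6 < IR ≤ 13, and more beyond that.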

CARD  TYPE  8  -  DATA  INPUT  CARDS 


These cards are used to input the observed coordinates,
$(y;\ z_1, z_2, \ldots, z_{IR})_i$, of the n data points, where IR is the number
of OCIV's and i = 1,2,...,n. The numerical values are entered on the
cards according to the format which has been specified in columns
41-80 of Card Type 2, ignoring columns 1 and 2. If more than one card
is required to represent each data point, the additional cards
(containing OCIV's only) will be read by the same format specification.
An explanation of the preparation of these cards is given below for
the assumed format 7F10.4 for data input.


Column  Format  Explanation

1-2     2X      Leave blank.

3-12    F10.4   Enter y1, the observed coordinate of the dependent
                variable for the first data point.

13-22   F10.4   Enter z11, the observed coordinate of the first OCIV
                for the first data point.

23-32   F10.4   Enter z21, the observed coordinate of the second OCIV
                for the first data point.

  .       .

63-72   F10.4   Enter z61, the observed coordinate of the sixth OCIV
                for the first data point.

Under the assumed format, 7F10.4, which is used here as an
example, and if 6 < IR ≤ 13, a second card would be needed to complete
the representation of the first data point. This second card would be
read with the same format (7F10.4) with the exception that columns
3-12 are used for z71, the observed coordinate of the seventh OCIV of
the first data point. If IR > 13, additional cards would be necessary
in order to completely represent the first data point, and the additional
cards would be written in the same format as the second card. The
coordinates $(y;\ z_1, z_2, \ldots, z_{IR})_i$ of the succeeding data points,
where i = 2,3,...,n, are input on successive cards in a similar manner.
The GCIV coordinates are generated using the OCIV coordinates which
are input on these cards. The DATA INPUT CARDS and the SYNTHETIC
DESIGN POINT CARD(S) are identical in format; however, the first
field of the DATA INPUT CARDS contains the coordinates of the
dependent variable and the first field of the SYNTHETIC DESIGN POINT
CARD(S) is left blank. The program limitation on the number, n, of
data points is: n ≤ 7000.
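
How a format specification from columns 41-80 of Card Type 2 maps onto card columns can be shown with a small Python sketch. It is not a FORTRAN format interpreter: it handles only F and E items with an optional repeat count, assumes every value is punched with an explicit decimal point (so the implied-decimal rule of the F format never applies), and uses hypothetical function names.

```python
import re


def field_columns(form):
    """Map a simple format such as 'F12.5,5F10.0,F8.4' to card columns.

    Columns are 1-based and start at column 3, since columns 1-2 of
    every DATA INPUT CARD are left blank.
    """
    columns, start = [], 3
    for item in form.replace(" ", "").split(","):
        match = re.fullmatch(r"(\d*)([FE])(\d+)\.(\d+)", item)
        if match is None:
            raise ValueError("unsupported format item: " + item)
        repeat = int(match.group(1) or 1)
        width = int(match.group(3))
        for _ in range(repeat):
            columns.append((start, start + width - 1))
            start += width
    return columns


def read_record(card, form):
    """Slice one 80-column card into floats according to the format."""
    card = card.ljust(80)
    values = []
    for first, last in field_columns(form):
        text = card[first - 1:last].strip()
        values.append(float(text) if text else 0.0)   # blank field -> 0
    return values


# First data card of the example problem, read with FORM = 3F10.0
card = "  " + "927.".rjust(10) + ".253".rjust(10) + "317.".rjust(10)
first_point = read_record(card, "3F10.0")
assumed = field_columns("7F10.4")
mixed = field_columns("F12.5, 5F10.0, F8.4")
```

Under 7F10.4 the fields fall on columns 3-12, 13-22, ..., 63-72, as in the table above; under F12.5, 5F10.0, F8.4 the seven fields also end at column 72 but with unequal widths.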

CARD  TYPE  9  -  DATA  TERMINATION  CARD 


Column  Format  Program  Variable  Explanation 

1-2     I2      Ml                Enter any non-zero value.

If  the  information  on  Card  Type  8  is  on  tape,  the  information 
on  Card  Type  9  must  be  on  tape  and  must  have  a  record  length  given  by 
the  format  in  columns  41-80  of  Card  Type  2  (or  the  assumed  format, 
7F10.4)  plus  2  columns. 

CARD  TYPE  10  -  RERUN  CARD(S)  (Optional) 

This control card provides the capability of deleting any
combination of independent variables (OCIV's or GCIV's) from the
original model and, thereby, repeating the regression computations
for a specified independent variable selection of N' < N IV's. If
all desired phases are executed, this repetition is called a rerun.
A rerun card must be included in the input deck for each rerun that
is desired and, therefore, NR (see columns 5-7, Card Type 2) rerun
cards are needed. Each column of a rerun card represents an
independent variable (OCIV or GCIV) in the original model for the main
run. If a 1 is entered in the column, the corresponding independent
variable is excluded from the model. If a 0 is entered in the column,
the corresponding independent variable is included in the model. This
card must be omitted from the input deck if NR = 0. NR ≤ 999.

Column  Format  Program Variable  Explanation

1       I1      LOT(1)            Enter a zero; this column represents
                                  the constant which must be retained
                                  in the regression model for all runs.

2       I1      LOT(2)            This column represents the first
                                  independent variable; enter a zero if
                                  it is to be retained in the model or
                                  enter a one if it is to be deleted from
                                  the model.

3       I1      LOT(3)            This column represents the second
                                  independent variable; enter a zero if it
                                  is to be retained in the model or enter
                                  a one if it is to be deleted from the
                                  model.

  .       .        .

51      I1      LOT(51)           This column represents the fiftieth (if N=50)
                                  independent variable; enter a zero if
                                  it is to be retained in the model or
                                  enter a one if it is to be deleted from
                                  the model.

Subsequent  rerun  cards  are  written  in  the  same  format. 
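
Decoding a rerun card amounts to reading one 0/1 flag per column, with column 1 reserved for the constant term. The following Python sketch illustrates this; the function name is hypothetical, and treating a blank column as a zero is an assumption based on how a blank I1 field is read.

```python
def retained_variables(rerun_card, n):
    """Return the 1-based subscripts of the IV's kept in a rerun.

    Column 1 stands for the constant term and must hold a zero;
    column v + 1 stands for the v-th independent variable.
    """
    flags = rerun_card.ljust(n + 1).replace(" ", "0")[:n + 1]
    if any(ch not in "01" for ch in flags):
        raise ValueError("only 0 and 1 may be punched")
    if flags[0] != "0":
        raise ValueError("the constant term cannot be deleted")
    # A 0 retains the variable, a 1 deletes it.
    return [v for v in range(1, n + 1) if flags[v] == "0"]


# Rerun card of the example problem (N = 9): LOT(1..10) = 0 0 1 1 0 1 1 1 0 1
kept = retained_variables("0011011101", 9)
```

For the example problem this leaves x1, x4 and x8 in the model, so the rerun uses N' = 3 independent variables.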


Example  Problem  Deck 


A card layout of the problem deck for the example problem
which is discussed in Section VI.5 is given on the following page.
An explanation for each card of the problem deck is also provided.

Card Type         Column  Explanation

1                 1-80    Identification of the problem.

2                 1-2     IR=2; two OCIV's (z1 and z2) are input.

                  3-4     IS=7; seven GCIV's are to be generated.

                  5-7     NR=1; one hand selected rerun is to be executed.

                  8-10    MVP=3; three synthetic design points are to be input.

                  11-13   NDR=2; two selected input design points will be
                          specified.

                  14      MVPL=1; predictions and prediction standard
                          deviations for the prediction line will be computed
                          for the 3 synthetic design points and the 2 selected
                          input design points for the main run and the hand
                          selected rerun.

                  15      NPE=1; predictions and prediction errors will be
                          computed and printed and the Chi-square test for
                          normality of the prediction errors will be
                          performed for all reruns.

                  16      NDPO=1; the coordinates of the data points will be
                          printed in the format 7E17.8 and the predictions
                          and the prediction errors will be printed in the
                          format 2E15.6.

                  17      TAPE=0; DATA INPUT and DATA TERMINATION are on
                          cards.

                  18      IVORGO=3; both IVOR and BIVOR will be used.

                  19-20   NFD=3; there are three data fields on each DATA
                          INPUT CARD.

                  21      IBID=0; the identity matrices will be computed for
                          all BIVOR reruns and the accuracy checks will be
                          performed on all identity matrices from BIVOR reruns.

                  23-31   I(1)=.1E-3=.0001 = accuracy criterion for printout
                          of identity matrices.

                  32-40   I(2)=.15E-1=.015 = accuracy criterion for rejection/
                          acceptance of runs.

                  41-80   FORM=3F10.0; input format by which each DATA INPUT
                          CARD is to be read is three ten-column fields in
                          the F format starting with column 3.

3                 1-20    IN(1,1)=1, IN(1,2)=2; the first GCIV (third
(first card)              independent variable) is z1 z2 = x3.

                  21-40   IN(2,1)=1, IN(2,2)=1; the second GCIV (fourth
                          independent variable) is z1 z1 = z1^2 = x4.

                  41-60   IN(3,1)=2, IN(3,2)=2; the third GCIV (fifth
                          independent variable) is z2 z2 = z2^2 = x5.

                  61-80   IN(4,1)=1, IN(4,2)=1, IN(4,3)=2; the fourth
                          GCIV (sixth independent variable) is
                          z1 z1 z2 = z1^2 z2 = x6.

3                 1-20    IN(5,1)=1, IN(5,2)=2, IN(5,3)=2; the fifth
(second card)             GCIV (seventh independent variable) is
                          z1 z2 z2 = z1 z2^2 = x7.

                  21-40   IN(6,1)=1, IN(6,2)=1, IN(6,3)=1; the sixth
                          GCIV (eighth independent variable) is
                          z1 z1 z1 = z1^3 = x8.

                  41-60   IN(7,1)=2, IN(7,2)=2, IN(7,3)=2; the seventh
                          GCIV (ninth independent variable) is
                          z2 z2 z2 = z2^3 = x9.

4                 1-2     IQ=4; IVOR will terminate after four independent
                          variables have been ordered.

                  3-5     MI=2; the independent variables are to be divided
                          into two groups for ordering by IVOR.

                  6-8     NJ(1)=2; the first two independent variables
                          (x1, x2) are to be considered as the first group.

                  9-11    NJ(2)=7; the next seven independent variables
                          (x3, x4, x5, x6, x7, x8, x9) are to be considered as
                          the second group.

5                 1-2     MB=3; the independent variables are to be divided
                          into three groups for ordering by BIVOR.

                  3-5     LOT(1)=2; the first two independent variables
                          (x1, x2) are to be considered as the first group
                          in BIVOR.

                  6-8     LOT(2)=3; the next three independent variables
                          (x3, x4, x5) are to be considered as the second
                          group.

                  9-11    LOT(3)=4; the next four independent variables
                          (x6, x7, x8, x9) are to be considered as the third
                          group.

6                 1-4     IKEEPR(1)=4; the fourth input design point
                          (according to order of input) is to be used as
                          a selected input design point for the calculations
                          specified in column 14, Card Type 2.

                  5-8     IKEEPR(2)=13; the thirteenth input design point
                          (according to order of input) is to be used as a
                          selected input design point for the calculations
                          specified in column 14, Card Type 2.

7                 13-22   The value of the first OCIV for the first
(first card)              synthetic design point is entered (syn z11 = .240).

                  23-32   The value of the second OCIV for the first
                          synthetic design point is entered (syn z21 = 350).

7                 13-22   The value of the first OCIV for the second
(second card)             synthetic design point is entered (syn z12 = .250).

                  23-32   The value of the second OCIV for the second
                          synthetic design point is entered (syn z22 = 400).

7                 13-22   The value of the first OCIV for the third
(third card)              synthetic design point is entered (syn z13 = .260).

                  23-32   The value of the second OCIV for the third
                          synthetic design point is entered (syn z23 = 450).

8                 3-12    The observed coordinate of the dependent variable
(first card)              for the first data point is entered (y1 = 927).

                  13-22   The observed coordinate of the first OCIV for the
                          first data point is entered (z11 = .253).

                  23-32   The observed coordinate of the second OCIV for the
                          first data point is entered (z21 = 317).

8                         These cards are written in the same format as the
(second card thru         preceding card.
twentieth card)

9                 1-2     A non-zero value is entered for the purpose of
                          indicating termination of data.

10                1       LOT(1)=0; the constant term must always be
                          retained in the model.

                  2       LOT(2)=0; the first independent variable (x1)
                          is included in the model for this rerun.

                  3       LOT(3)=1; the second independent variable (x2)
                          is excluded from the model for this rerun.

                  4       LOT(4)=1; the third independent variable (x3) is
                          excluded from the model for this rerun.

                  5       LOT(5)=0; the fourth independent variable (x4)
                          is included in the model for this rerun.

                  6       LOT(6)=1; the fifth independent variable (x5)
                          is excluded from the model for this rerun.

                  7       LOT(7)=1; the sixth independent variable (x6)
                          is excluded from the model for this rerun.

                  8       LOT(8)=1; the seventh independent variable (x7)
                          is excluded from the model for this rerun.

                  9       LOT(9)=0; the eighth independent variable (x8)
                          is included in the model for this rerun.

                  10      LOT(10)=1; the ninth independent variable (x9)
                          is excluded from the model for this rerun.
VI. COMPUTATION AND PRINTOUT

VI.1 Some Basic Computational Features

In this section some basic computational features will be
discussed which merit being set aside from the description of the
computational details given in Section VI.2. The discussion of
these features may also provide a better understanding of the
DA-MRCA program as a whole.

VI.1.a Matrix Inversion

The inverse of the matrix of the normal equations and
the solution vector are obtained, in any given run, by the Gaussian
elimination method with the largest element as pivot. In the following,
the algorithm is outlined for the interested reader who prefers a
discussion in general algebraic terms rather than interpreting those
parts of the program listing (Section VIII.4) which represent this
inversion procedure. The proof for the validity of the algorithm is
omitted since it appears to be beyond the scope and intent of the
present report. A proof is given, for example, in Cohen [1959]. The
inversion subroutine was adopted without change from the nucleus
program (TV-MRCA) of DA-MRCA.

The algorithm is described in terms of the main run,
that is, as applied to the (N+1) x (N+1) matrix of the normal
equations augmented by the right-hand vector of the N+1 elements $E_{vy}$.
However, the algorithm is identically applied also to all reruns with
N' < N independent variables contained in the model.

The  procedure  (as  discussed  for  the  main  run)  consists 
of  N+l  cycles,  after  each  of  which  all  (N+l)(N+2)  elements  involved 
will  have  changed.  The  elements  of  the  matrix  of  the  ith  cycle  are 
denoted  by  the  superscript  i  attached  to  the  elements  Evv*  and  EVy: 
lEvv» ,  xEVy.  By  definition,  i=0  indicates  the  original  element  . 

'Evv»  =  Evv‘  ,  °EVy  =  Evy;  v,v'  =  0,1,..., N.  At  the  end  of  cycle 
number  N+l,  the  elements  equal  those  of  the  inverse  matrix  A-1  and 
of  the  regression  coefficients,  respectively:  ,'4’1EVV»  =  cVv*  and 
N+1EVy  =  bv .  The  algorithm  is  as  follows: 

1st Cycle (i=1)

(1) The square matrix A of the normal equations with rank N+1 is
searched for the element with largest absolute value, which is found
on the main diagonal. This element is called the pivot element and is
denoted by ${}^{0}\Sigma_{pp}$. Row p is called the pivot row; this row
cannot be used as the pivot row in any one of the remaining N cycles.

All subsequent steps (Nos. (2) - (5)) of the 1st cycle are exactly
like steps (2) - (5) of the ith cycle as described below.

ith Cycle


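(1) The square matrix of rank (N+1)-(i-1) = N+2-i, obtained from the
matrix at the end of cycle No. i-1 by deleting all i-1 rows and
columns corresponding to the pivot elements used previously, is
searched for the element with largest absolute value, which is found
on the main diagonal. This element is the pivot element of the ith
cycle and is denoted by ${}^{i-1}\Sigma_{pp}$. The corresponding row
cannot be used as the pivot row in any one of the remaining N+1-i
cycles.

(2)
$$ {}^{i}\Sigma_{pv'} = \frac{{}^{i-1}\Sigma^{*}_{pv'}}{{}^{i-1}\Sigma_{pp}} \quad \text{for } v' = 0,1,\ldots,N, $$
with
$$ {}^{i-1}\Sigma^{*}_{pv'} = \begin{cases} {}^{i-1}\Sigma_{pv'} & \text{if } v' \neq p \\ 1 & \text{if } v' = p \end{cases} $$

(3)
$$ {}^{i}\Sigma_{py} = \frac{{}^{i-1}\Sigma_{py}}{{}^{i-1}\Sigma_{pp}} $$

(4)
$$ {}^{i}\Sigma_{vv'} = {}^{i-1}\Sigma^{*}_{vv'} - {}^{i-1}\Sigma_{vp}\,{}^{i}\Sigma_{pv'} \quad \text{for } v = 0,1,\ldots,p-1,p+1,\ldots,N; \; v' = 0,1,\ldots,N, $$
with
$$ {}^{i-1}\Sigma^{*}_{vv'} = \begin{cases} {}^{i-1}\Sigma_{vv'} & \text{if } v' \neq p \\ 0 & \text{if } v' = p \end{cases} $$

(5)
$$ {}^{i}\Sigma_{vy} = {}^{i-1}\Sigma_{vy} - {}^{i-1}\Sigma_{vp}\,{}^{i}\Sigma_{py} \quad \text{for } v \neq p. $$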

(N+1)th Cycle

The computations are as in (1) - (5) of the ith cycle with i=N+1. The
results are:

$$ {}^{N+1}\Sigma_{vv'} = c_{vv'}, \qquad {}^{N+1}\Sigma_{vy} = b_v \qquad \text{for } v, v' = 0,1,2,\ldots,N. $$

The determinant, $\Delta$, of the matrix A equals the product of the
N+1 pivot elements of the N+1 cycles:

$$ \Delta = \prod_{i=0}^{N} {}^{i}\Sigma_{pp}. $$
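The cycle structure above can be illustrated by a short sketch. It is written in Python rather than FORTRAN IV and is not the DA-MRCA inversion subroutine; the function name `invert_normal_equations` is hypothetical. As a simplifying assumption, the pivot search is restricted to the main diagonal, which is where the text states the largest element is found for the matrix of the normal equations.

```python
def invert_normal_equations(matrix, right_hand):
    """Sketch of the (N+1)-cycle elimination with diagonal pivots.

    matrix: square list of lists (the normal-equation matrix A).
    right_hand: list holding the right-hand vector.
    Returns (inverse, solution, determinant). Assumption: every pivot
    lies on the main diagonal, as stated in the text.
    """
    size = len(matrix)
    m = [list(row) for row in matrix]   # work on copies
    y = list(right_hand)
    unused = set(range(size))           # rows not yet used as pivot rows
    determinant = 1.0

    for _ in range(size):
        # Step (1): largest remaining diagonal element is the pivot.
        p = max(unused, key=lambda k: abs(m[k][k]))
        unused.remove(p)
        pivot = m[p][p]
        determinant *= pivot

        # Step (2): pivot position is replaced by 1, then row p is divided.
        m[p][p] = 1.0
        m[p] = [value / pivot for value in m[p]]
        # Step (3): same division for the right-hand element.
        y[p] = y[p] / pivot

        # Steps (4) and (5): eliminate column p from all other rows.
        for v in range(size):
            if v == p:
                continue
            factor = m[v][p]
            m[v][p] = 0.0               # starred element is 0 for v' = p
            m[v] = [m[v][k] - factor * m[p][k] for k in range(size)]
            y[v] = y[v] - factor * y[p]

    return m, y, determinant
```

After the last cycle the work matrix holds the inverse and the work vector holds the solution, which mirrors the statement that the final elements equal the $c_{vv'}$ and the $b_v$.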

VI.1.b  Checks on the Accuracy of the Inverse Matrix

VI.1.b.(1)  Introductory Remarks

The accuracy of the inverse, $A^{-1}$, of the matrix of the normal
equations of a given run with K (≤ N) independent variables, which is
obtained in DA-MRCA by the modified Gaussian elimination process as
described in the previous section, depends upon the natural
limitation of the computer accuracy. For example, in the IBM 7030, 13
digit accuracy is present when single precision is used as in DA-MRCA.
The limited computer accuracy causes the propagation of errors. Some
contributing factors to the amount of these errors, as contained in
the elements of $A^{-1}$, are:

(a)  the  rank  of  the  matrix  A; 

(b)  the  underlying  type  of  regression  problem  (for 
example,  polynomial  regression  vs.  ordinary  linear  regression  with 
original  independent  variables  only) ; 

(c) the ranges of the values of the independent variables (for
example, $|x_v| > 1$ vs. $|x_v| < 1$);

(d) the relative position of the nK distinct input design points.


In general (an exception is discussed in Section VI.1.b.(3)), the
only practical way to check on the amount of the propagated errors
contained in the elements of the inverse $A^{-1}$ is to calculate the
product

$$ I_c = A^{-1}A, \qquad \text{(VI-1)} $$

that is, to form a "calculated identity matrix", $I_c$, and to compare
it with the exact identity (or unit) matrix, I. This is done in the
present program for each run (in BIVOR, however, only when specified,
see column 21 of Card Type 2, Section V.2). The checks on $I_c$, as
described further below, not only serve to reject unacceptably
inaccurate inverses but also to identify cases in which the matrix
of the normal equations contains "obvious" or "non-obvious" linear
dependencies. These topics are further discussed, along with corrective
measures to be taken in such rejection cases, in Chapter VII.

When $I_c$ is calculated according to (VI-1), it is possible that the
errors contained in the elements of $A^{-1}$ are drastically magnified
such that the off-diagonal elements of $I_c$ are far from zero. This
may even be true under the (unrealistic) assumption that the elements
of $A^{-1}$ are obtained without computational errors, except for the
truncation errors due to the natural limitation of the computer
accuracy, i.e., 13-digit accuracy as present on the IBM 7030 with
single precision. In fact, the derivations in Section VI.1.b.(2) below
are based on this assumption that the elements of $A^{-1}$ are free
from error, except truncation error. The main diagonal elements of
$I_c$ (which should all be 1) are the only elements of $I_c$ which will
never be affected by this type of magnifying process. Therefore, the
accuracy check on $I_c$ is restricted, in DA-MRCA, to the main
diagonal. If the largest deviation from 1 in the main diagonal of $I_c$
exceeds the input value of 1(2) specified by the program user, the
inverse is automatically rejected by the program as being unacceptably
inaccurate. (The deviations from zero of the off-diagonal elements of
$I_c$ are also checked, but only for the purpose of deciding whether
or not the matrix $I_c$ is to be printed for visual inspection.)

The justification for the above statements is given in the next
section and is based on the regression model (I-1) as used in DA-MRCA.
If the model

$$ y = \bar{y} + \sum_{v=1}^{N} \beta_v (x_v - \bar{x}_v) + \varepsilon, \qquad \text{(VI-2)} $$

i.e., the "adjusted" regression model, were used, the elements
($\Sigma'_{vv'}$, say) of the matrix of the normal equations would also
be adjusted for the averages, e.g.,
$\Sigma'_{vv'} = \Sigma_{vv'} - \bar{x}_v \Sigma_{0v'}$, and a different
situation (not necessarily an improved one) would arise with respect
to the error magnifying process when calculating an identity matrix.
See the remarks in Section VII.2.a concerning the effects of the
transformation (VII-1), $v = (x - \bar{x})/R_x$.

VI.1.b.(2)  Justification for the Rejection Criterion

In this section a justification is given for the rejection criterion
(as described before) which involves only the main diagonal elements
of $I_c = A^{-1}A$. The justification is given under the simplifying
assumption that the elements, $c_{vv'}$, of $A^{-1}$ are free from
error, except truncation error. It will be shown that even these
truncation errors in the $c_{vv'}$ are sometimes sufficient to cause
large deviations from zero in the off-diagonal elements of $I_c$.
Naturally, these deviations are even larger when the $c_{vv'}$ also
contain propagated errors, as is almost always the case in reality.

All  errors  will  be  derived  in  terms  of  their 
approximate  "orders  of  magnitude."  For  this  purpose  the  following 
definition  is  introduced: 

Definition: The "order of magnitude" of a number, z, is defined, for
the derivations of this section, to be the nearest power of ten to
which z can be rounded. The symbol "$\cong$" is used to indicate that
the number or algebraic term located to the right of the symbol is the
order of magnitude of the term located to the left of the symbol. The
symbol is also applied to matrices, and its meaning shall then be that
the matrix to the right of the symbol is the matrix of the orders of
magnitude of the corresponding elements of the matrix to the left of
the symbol.

For example, for z = 677232:

$$ z = 677232 = .7 \times 10^{6} \cong 1 \times 10^{6} = 10^{6}. $$

Another example is:

$$ z = -0.0434 = -0.4 \times 10^{-1} \cong -.1 \times 10^{-1} = -10^{-2}. $$
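The definition can be made concrete with a small sketch. The function name `order_of_magnitude` is hypothetical, and the two-stage rounding (first to one significant digit, then to a power of ten) is an interpretation of the two worked examples, not a rule stated elsewhere in the report.

```python
def order_of_magnitude(z):
    """Return the signed power of ten to which z rounds (sketch).

    Assumption: z is first rounded to one significant digit, and a
    leading digit of 5 or more then rounds up to the next power of ten.
    """
    if z == 0:
        raise ValueError("zero has no order of magnitude")
    # Round |z| to one significant digit, e.g. 677232 -> '7e+05'.
    digit_text, exponent_text = f"{abs(z):.0e}".split("e")
    digit, exponent = int(digit_text), int(exponent_text)
    if digit >= 5:
        exponent += 1
    sign = "-" if z < 0 else ""
    # Build the result from text so that it is the nearest float.
    return float(f"{sign}1e{exponent}")
```

Applied to the two examples of the text, the sketch gives $10^{6}$ for 677232 and $-10^{-2}$ for -0.0434.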


The approximate orders of magnitude of the truncation errors contained
in the elements of $A^{-1}$ and of the errors in the elements of $I_c$
will be derived for the case of the main run, that is, for A being of
rank (N+1) x (N+1). Naturally, the results are similarly valid for the
matrix A of any rerun with N' < N independent variables.

With $\Sigma_{vv'} = \sum_{i=1}^{n} x_{vi} x_{v'i}$, the matrix A of
the main run is

$$ A = \begin{pmatrix} \Sigma_{00} & \Sigma_{01} & \cdots & \Sigma_{0N} \\ \Sigma_{10} & \Sigma_{11} & \cdots & \Sigma_{1N} \\ \vdots & \vdots & & \vdots \\ \Sigma_{N0} & \Sigma_{N1} & \cdots & \Sigma_{NN} \end{pmatrix} \qquad \text{(VI-3)} $$

The elements of $A^{-1}$ will be expressed following Cramer's Rule.
This may be done because the specific characteristics of the inversion
process of DA-MRCA and the associated error propagation are
unimportant for the purpose of the present derivation. To repeat, the
only purpose is to show the magnifying process of the truncation errors
contained in $A^{-1}$ which can take place when $I_c = A^{-1}A$ is
formed.

To arrive at the justification desired, it will further be necessary
to make use of a known result from the theory of determinants: The
determinant of order k,

$$ D = \begin{vmatrix} d_{11} & d_{12} & \cdots & d_{1k} \\ d_{21} & d_{22} & \cdots & d_{2k} \\ \vdots & \vdots & & \vdots \\ d_{k1} & d_{k2} & \cdots & d_{kk} \end{vmatrix} $$

can be expressed in the following form:

$$ D = \sum^{k!} \left( \pm\, d_{1\alpha} d_{2\beta} d_{3\gamma} \cdots d_{k\omega} \right), \qquad \text{(VI-4)} $$

where the summation extends over all k! members which result from the
k! possible permutations of the subscripts
$\alpha, \beta, \gamma, \ldots, \omega$, each subscript taking one of
the values 1, 2, 3, ..., k.


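Applying (VI-4) to the elements $c_{vv'}$ of $A^{-1}$ and recalling
that $\Sigma_{vv'} = \Sigma_{v'v}$, one has from (VI-3) according to
Cramer's Rule:

$$ c_{vv'} = \frac{1}{\Delta} \sum^{N!} \left( \pm\, \Sigma_{0 i_0} \Sigma_{1 i_1} \Sigma_{2 i_2} \cdots \Sigma_{j i_j} \cdots \Sigma_{(v-1) i_{v-1}} \Sigma_{(v+1) i_{v+1}} \cdots \Sigma_{N i_N} \right) \qquad \text{(VI-5)} $$

with $i_j = 0,1,2,\ldots,(v'-1),(v'+1),\ldots,N$; $i_j \neq i_{j'}$,
where

$$ \Delta = \mathrm{Det}(A) = \sum^{(N+1)!} \left( \pm\, \Sigma_{0 i_0} \Sigma_{1 i_1} \Sigma_{2 i_2} \cdots \Sigma_{j i_j} \cdots \Sigma_{N i_N} \right) \qquad \text{(VI-6)} $$

with $i_j = 0,1,2,\ldots,N$; $i_j \neq i_{j'}$.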

The sum in (VI-5) consists of an even number (N!) of members with
alternating signs. Accordingly, the truncation error of this sum (or
that of $c_{vv'}\Delta$) should have an approximate order of magnitude
equal to that of the truncation error of the absolutely largest one of
the N! members. However, the largest member cannot generally be
defined, but an upper bound for it can be determined by the
application of Schwarz's inequality. This upper bound equals

$$ U = \left( \prod_{\substack{j=0 \\ j \neq v,v'}}^{N} \Sigma_{jj} \right) \sqrt{\Sigma_{vv}\Sigma_{v'v'}}. \qquad \text{(VI-7)} $$

It will be demonstrated later that it is not unrealistic to use this
upper bound for the largest of the N! members in $c_{vv'}\Delta$
because the use of U and of the value

$$ U' = \left( \prod_{\substack{j=0 \\ j \neq v,v'}}^{N} \Sigma_{jj} \right) \Sigma_{v'v} \qquad \text{(VI-8)} $$

both lead to the same approximate results. But U' is indeed one of the
N! members in the sum of (VI-5).

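In order to illustrate the derivation of formulae (VI-7) and (VI-8),
the term $c_{vv'}\Delta$ is evaluated for the example case of v=2,
v'=3, and N=3: For this example, in none of the members in the sum
(VI-5) is there a $\Sigma$-term having as its first subscript v=2 or as
its second subscript v'=3. Disregarding the signs, the 6 members of the
sum are:

1. $\Sigma_{00}\Sigma_{11}\Sigma_{32}$,
2. $\Sigma_{00}\Sigma_{12}\Sigma_{31}$,
3. $\Sigma_{01}\Sigma_{10}\Sigma_{32}$,
4. $\Sigma_{01}\Sigma_{12}\Sigma_{30}$,
5. $\Sigma_{02}\Sigma_{11}\Sigma_{30}$,
6. $\Sigma_{02}\Sigma_{10}\Sigma_{31}$.

The first of these members is the one which was generally denoted as U'
in (VI-8). Recalling that

$$ \Sigma_{v'v} = \sum_{i=1}^{n} x_{v'i} x_{vi} $$

or shorter,

$$ \Sigma_{v'v} = \sum x_{v'} x_{v}, $$

Schwarz's inequality shows that

$$ \Sigma_{v'v} = \sum x_{v'} x_{v} \leq \sqrt{\sum x_{v'}^{2} \sum x_{v}^{2}} = \sqrt{\Sigma_{v'v'}\Sigma_{vv}}. $$

Therefore, and according to (VI-7), the value
$U = \Sigma_{00}\Sigma_{11}\sqrt{\Sigma_{22}\Sigma_{33}}$ is also an
upper bound for $U' = \Sigma_{00}\Sigma_{11}\Sigma_{32}$. (As indicated
before for the general case, both values
$\Sigma_{00}\Sigma_{11}\Sigma_{32}$ and
$\Sigma_{00}\Sigma_{11}\sqrt{\Sigma_{22}\Sigma_{33}}$ will lead to the
same results with respect to the approximate orders of magnitude of the
truncation error of $c_{23}$.) To show the validity of the upper bound
U for one more member of the six, take the fourth:
$\Sigma_{01}\Sigma_{12}\Sigma_{30}$. Here one has

$$ \Sigma_{01}\Sigma_{12}\Sigma_{30} = \sum x_0 x_1 \sum x_1 x_2 \sum x_3 x_0 \leq \sqrt{\sum x_0^2 \sum x_1^2}\,\sqrt{\sum x_1^2 \sum x_2^2}\,\sqrt{\sum x_3^2 \sum x_0^2} $$

$$ = \sum x_0^2 \sum x_1^2 \sqrt{\sum x_2^2 \sum x_3^2} = \Sigma_{00}\Sigma_{11}\sqrt{\Sigma_{22}\Sigma_{33}}. $$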
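The bound can also be checked numerically for all six members at once. The following sketch is an illustration only; the function name `cofactor_members_and_bound` and the small data set used in the usage note are hypothetical and do not come from the report.

```python
from itertools import permutations
from math import prod, sqrt


def cofactor_members_and_bound(columns, v, v_prime):
    """List the members of the sum (VI-5) and the bound U of (VI-7).

    columns[k] holds the n observed values of x_k, with x_0 equal to 1
    throughout. Signs of the members are disregarded, as in the text.
    """
    size = len(columns)
    # sums[r][c] is the sum of products of x_r and x_c over all points.
    sums = [[sum(a * b for a, b in zip(columns[r], columns[c]))
             for c in range(size)] for r in range(size)]
    rows = [r for r in range(size) if r != v]          # first subscripts
    cols = [c for c in range(size) if c != v_prime]    # second subscripts
    members = [prod(sums[r][c] for r, c in zip(rows, perm))
               for perm in permutations(cols)]
    others = [j for j in range(size) if j not in (v, v_prime)]
    bound = prod(sums[j][j] for j in others) * sqrt(
        sums[v][v] * sums[v_prime][v_prime])
    return members, bound
```

For instance, with `columns = [[1, 1, 1, 1], [1, 2, 3, 4], [1, 4, 9, 16], [2, -1, 0, 5]]`, v=2 and v'=3, the function returns six members, the first of which is the member called U' in the text, and none of them exceeds U in absolute value.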


Continuing the main derivation, the truncation error of $c_{vv'}$
will be called $\delta(c_{vv'})$ and expressed as $10^{-H}c_{vv'}$,
where the exponent H (> 0) is left unspecified for the time being.
Therefore, replacing the sum in (VI-5) by the term (VI-7) (which
substitution, according to the argument used, is possible only under
the simultaneous multiplication of both sides of (VI-5) with
$10^{-H}$) one gets:

$$ \delta(c_{vv'}) = 10^{-H}c_{vv'} \cong \frac{10^{-H}}{\Delta} \left( \prod_{\substack{j=0 \\ j \neq v,v'}}^{N} \Sigma_{jj} \right) \sqrt{\Sigma_{vv}\Sigma_{v'v'}}. \qquad \text{(VI-9)} $$

Here, it is sufficient to replace $\Delta$ by its approximate order of
magnitude. This can be set equal to the order of magnitude of the
product of the main diagonal elements of A,

$$ \prod_{j=0}^{N} \Sigma_{jj}, $$

which is the largest member in the sum of the (N+1)! members in (VI-6).
In doing so, therefore, one actually replaces $\Delta$ by an upper
bound which results in a lower bound for the order of magnitude of the
truncation error of $c_{vv'}$:

$$ \delta(c_{vv'}) \cong \frac{10^{-H}}{\sqrt{\Sigma_{vv}\Sigma_{v'v'}}}. \qquad \text{(VI-10)} $$

However, if the lower bounds of the truncation errors are able to cause
the large deviations in the off-diagonal elements of $I_c$ (as will be
shown), these deviations are in reality even larger for the true
truncation errors in the $c_{vv'}$.

The element $i_{vv'}$ of $I_c = A^{-1}A$ is obtained as

$$ i_{vv'} = \sum_{v^*=0}^{N} c_{vv^*}\Sigma_{v^*v'}. \qquad \text{(VI-11)} $$

With

$$ c_{vv^*} = c^{0}_{vv^*} + \delta(c_{vv^*}), $$

where $c^{0}_{vv^*}$ is the true value of the inverse element (i.e., a
value free from truncation error and any other error) and where
$\delta(c_{vv^*})$ is the truncation error of $c_{vv^*}$ as defined
before, one has from (VI-11):

$$ i_{vv'} = \sum_{v^*=0}^{N} \{ c^{0}_{vv^*} + \delta(c_{vv^*}) \} \Sigma_{v^*v'} = \begin{cases} 1 + \sum_{v^*=0}^{N} \{\delta(c_{vv^*})\}\Sigma_{v^*v} & \text{if } v' = v \\ 0 + \sum_{v^*=0}^{N} \{\delta(c_{vv^*})\}\Sigma_{v^*v'} & \text{if } v' \neq v \end{cases} $$

This leads to the definition of the error of $i_{vv'}$ caused by the
truncation error of $c_{vv^*}$:

$$ \delta(i_{vv'}) = \sum_{v^*=0}^{N} \{\delta(c_{vv^*})\}\Sigma_{v^*v'}. \qquad \text{(VI-12)} $$

(Notice that this derivation implied the assumption of no additional
truncation errors being introduced when forming $i_{vv'}$.)

Inserting (VI-10) into (VI-12) one has:

$$ \delta(i_{vv'}) \cong \sum_{v^*=0}^{N} 10^{-H} \frac{\Sigma_{v^*v'}}{\sqrt{\Sigma_{vv}\Sigma_{v^*v^*}}}. \qquad \text{(VI-13)} $$

At this point it is necessary to introduce another approximation.
Since only orders of magnitude are considered, it appears sufficient
to put, in general,

$$ \Sigma_{vv'} = \sum_{i=1}^{n} x_{vi}x_{v'i} \cong n\,\bar{x}_v \bar{x}_{v'}. \qquad \text{(VI-14)} $$

Substituting these orders of magnitude in (VI-13), one gets

$$ \delta(i_{vv'}) \cong \sum_{v^*=0}^{N} 10^{-H} \frac{\bar{x}_{v'}}{\bar{x}_v} = (N+1)10^{-H} \frac{\bar{x}_{v'}}{\bar{x}_v}. \qquad \text{(VI-15)} $$

An identical result is obtained when the term (VI-8), U', rather than
(VI-7), U, is used to replace the sum in (VI-5). In this case one has,
instead of (VI-10), for the truncation error of $c_{vv'}$:

$$ \delta'(c_{vv'}) \cong \frac{10^{-H}\Sigma_{v'v}}{\Sigma_{vv}\Sigma_{v'v'}}. $$

This leads to the error of $i_{vv'}$, corresponding to (VI-13):

$$ \delta'(i_{vv'}) \cong \sum_{v^*=0}^{N} 10^{-H} \frac{\Sigma_{v^*v}\Sigma_{v^*v'}}{\Sigma_{vv}\Sigma_{v^*v^*}}. $$

Using again the approximation (VI-14), one has
$\delta'(i_{vv'}) \cong \delta(i_{vv'})$, which was to be shown.


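Finally, using (VI-15), the matrix $\delta(I_c)$ of the approximate
orders of magnitude of the errors in $I_c$, caused by the truncation
errors $\delta(c_{vv'})$ only, is obtained:

$$ \delta(I_c) \cong (N+1)10^{-H} \begin{pmatrix} 1 & \bar{x}_1 & \cdots & \bar{x}_v & \cdots & \bar{x}_{v'} & \cdots & \bar{x}_N \\ \frac{1}{\bar{x}_1} & 1 & \cdots & \frac{\bar{x}_v}{\bar{x}_1} & \cdots & \frac{\bar{x}_{v'}}{\bar{x}_1} & \cdots & \frac{\bar{x}_N}{\bar{x}_1} \\ \vdots & \vdots & & \vdots & & \vdots & & \vdots \\ \frac{1}{\bar{x}_v} & \frac{\bar{x}_1}{\bar{x}_v} & \cdots & 1 & \cdots & \frac{\bar{x}_{v'}}{\bar{x}_v} & \cdots & \frac{\bar{x}_N}{\bar{x}_v} \\ \vdots & \vdots & & \vdots & & \vdots & & \vdots \\ \frac{1}{\bar{x}_{v'}} & \frac{\bar{x}_1}{\bar{x}_{v'}} & \cdots & \frac{\bar{x}_v}{\bar{x}_{v'}} & \cdots & 1 & \cdots & \frac{\bar{x}_N}{\bar{x}_{v'}} \\ \vdots & \vdots & & \vdots & & \vdots & & \vdots \\ \frac{1}{\bar{x}_N} & \frac{\bar{x}_1}{\bar{x}_N} & \cdots & \frac{\bar{x}_v}{\bar{x}_N} & \cdots & \frac{\bar{x}_{v'}}{\bar{x}_N} & \cdots & 1 \end{pmatrix} \qquad \text{(VI-16)} $$

The formulation (VI-16) shows the following: (a) the orders of
magnitude of the errors in the main diagonal elements of $I_c$ (caused
only by truncation errors in the $c_{vv'}$) are approximately
$(N+1)10^{-H}$ and are, therefore, independent of the numerical values
of the independent variables; (b) the orders of magnitude of the errors
in the off-diagonal elements of $I_c$ are approximately the orders of
magnitude of the ratios, multiplied by $(N+1)10^{-H}$, of the averages
of the independent variables as given in the matrix and are, therefore,
dependent upon the numerical values of these independent variables;
(c) the approximate orders of magnitude of the errors of the
off-diagonal elements of $I_c$ are reciprocal with respect to the main
diagonal (apart from the factor $(N+1)10^{-H}$), viz.,

$$ \delta(i_{vv'}) \cong (N+1)10^{-H}\frac{\bar{x}_{v'}}{\bar{x}_v} \quad \text{versus} \quad \delta(i_{v'v}) \cong (N+1)10^{-H}\frac{\bar{x}_v}{\bar{x}_{v'}}. $$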

According to these findings, an off-diagonal element of $I_c$ can
appear to be so much in error that it is not even in the vicinity of
zero. This is particularly likely to happen when one deals with
polynomials. For example, in a polynomial in one independent variable
x, the term

$$ \bar{x}_v = \frac{1}{n}\sum_{i=1}^{n} x_i^{v} $$

can be rather large when |x| > 1 and the exponent v is sufficiently
large.

If the order of magnitude of $\bar{x}_v$ is called $10^M$, then the
error of $i_{0v}$ is, for example, according to (VI-15):

$$ \delta(i_{0v}) \cong (N+1)10^{-H}\bar{x}_v \cong (N+1)10^{-H+M}. $$

If M is approximately equal to H, the apparent deviation of $i_{0v}$
from zero can be considerable, and it is obvious that this deviation
can be large even if the matrix inversion was perfectly accurate
within the natural limitations of the computer accuracy.
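The predicted errors of (VI-15) are easily tabulated for any set of averages. The following sketch is an illustration only; the function name `predicted_identity_errors` and its argument names are hypothetical, and the result is the approximate error matrix of (VI-16), not an output of DA-MRCA.

```python
def predicted_identity_errors(averages, digits):
    """Approximate error matrix of the calculated identity (sketch).

    averages[v] is the average of the independent variable x_v, with
    averages[0] == 1 for the constant term. digits plays the role of
    the exponent H. Element [v][v'] is (N+1) * 10**(-H) * mean(x_v')
    divided by mean(x_v).
    """
    scale = len(averages) * 10.0 ** (-digits)
    return [[scale * column_mean / row_mean for column_mean in averages]
            for row_mean in averages]
```

With one independent variable whose average is about $3 \times 10^{15}$ and H=14, the element in row 0, column 1 is about 60, while the main diagonal stays at $2 \times 10^{-14}$.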

The  following  simple  example  was  actually  computed 
with  DA-MRCA  in  order  to  illustrate  what  has  been  shown  theoretically. 
The  numbers  displayed  are  taken  from  the  program  output.  There  is  only 
one  independent  variable  in  the  example,  and  its  5  distinct  numerical 
values  were  chosen  extremely  large  in  order  to  emphasize  the  effect. 

The x values are as follows (written in the exponential format):




.39062500E+14
.26435638E+15
.75493321E+15
.39721133E+16
.10000000E+17

The matrix A is accordingly:

    A      =  [  .50000000E+01    .15030465E+17 ]
              [  .15030465E+17    .11641902E+33 ]

From this, DA-MRCA computed $A^{-1}$:

    A^{-1} =  [  .32685427E-00   -.42199048E-16 ]
              [ -.42199048E-16    .14037838E-31 ]

and, finally:

    I_c = A^{-1}A  =  [  .10000000E+01    .64000000E+02 ]
                      [ -.78886091E-30    .10000000E+01 ]


The deviation of $i_{01}$ from zero is 64, that is, the apparent error
of $i_{01}$ has an order of magnitude $10^{2}$. According to (VI-15),
the error of $i_{01}$ should have an approximate order of magnitude
equal to that of $(N+1) \times 10^{-H}\bar{x}_1$. The average of the 5
levels of $x_1 = x$ is $\bar{x}$ = .30060931E+16. Therefore, the error
of $i_{01}$ should have an approximate order of magnitude of
$2(10^{-H})(.3)10^{+16} \cong 10^{16-H}$. With H=14 for the IBM 7030
(single precision), the apparent order of magnitude of the error of
$i_{01}$ equals the one theoretically predicted:
$\delta(i_{01}) \cong 10^{16-14} = 10^{2}$. Equally interesting is the
apparent order of magnitude of the error of $i_{10}$ = -.78886091E-30
which is $10^{-30}$ if one neglects the negative sign. According to
(VI-15), the approximate order of magnitude of the deviation of
$i_{10}$ from zero should be that of $2(10^{-H}) \cdot 1/\bar{x}$,
which is $10^{-H}10^{-15} = 10^{-29}$ with H=14. This approximation,
therefore, is almost as good as the one for $\delta(i_{01})$.

Finally, the errors in the main diagonal elements of $I_c$ should have
orders of magnitude equal to that of $2(10^{-14})$ which cannot be
observed since only 8 digits are printed by the program. Obviously, in
this case, the good agreement between the predicted and apparent
orders of magnitude of the errors in $I_c$ is due to the small rank of
the matrix A. It can be assumed that propagation errors are practically
absent when a matrix of rank 2, as in this example, is inverted. In
this case, therefore, the apparent errors in the elements of $I_c$
should essentially be the magnified truncation errors of the
$c_{vv'}$, the approximate orders of magnitude of which are given by
(VI-16).
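The effect can be imitated on a present-day machine by limiting the working precision. The sketch below uses decimal arithmetic with a fixed number of significant digits as a stand-in for the limited accuracy of the computer. This is an assumption made for illustration: the IBM 7030 works in binary and DA-MRCA uses the elimination method of Section VI.1.a, so the figures obtained will not reproduce the printed values above. The function name `calculated_identity` is hypothetical.

```python
from decimal import Decimal, localcontext


def calculated_identity(x_values, digits):
    """Form A, its inverse and the calculated identity for one IV (sketch).

    x_values: the observed x values as decimal strings.
    digits: number of significant decimal digits kept after every
    arithmetic operation (assumed stand-in for the machine accuracy).
    The 2 x 2 inverse is written out directly instead of being
    obtained by elimination.
    """
    with localcontext() as context:
        context.prec = digits
        xs = [Decimal(x) for x in x_values]
        n = Decimal(len(xs))
        sum_x = sum(xs, Decimal(0))
        sum_xx = sum((x * x for x in xs), Decimal(0))
        a = [[n, sum_x], [sum_x, sum_xx]]
        determinant = n * sum_xx - sum_x * sum_x
        inverse = [[sum_xx / determinant, -sum_x / determinant],
                   [-sum_x / determinant, n / determinant]]
        identity = [[inverse[r][0] * a[0][c] + inverse[r][1] * a[1][c]
                     for c in range(2)] for r in range(2)]
    return a, inverse, identity
```

Called with the five x values listed above and `digits=14`, the main diagonal of the result stays within a few units of $10^{-13}$ of 1, whereas the element in row 0, column 1 is a difference of two numbers of order $10^{15}$ carried to only 14 digits and can therefore differ visibly from zero.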

It should be noted that the errors of the off-diagonal elements of
$I_c$ might appear to be large not only when the x values are very
large (and of equal sign) as in the above example, but also when the x
values are very small (and of equal sign). If the latter is the case,
the deviations from zero of the elements in the lower half of $I_c$
will be very large.

The only way to guarantee that the errors of all elements of $I_c$
will be of equal order of magnitude (i.e., $(N+1) \times 10^{-H}$)
would be to apply a standardizing transformation to the x values, such
as $v = (x - \bar{x})/R_x$, which is discussed in Section VII.2.a.
With $R_x = \max(x) - \min(x)$, this transformation results in average
values of the independent variables which have an approximate order of
magnitude 1, and this, as can be seen from (VI-16), leads to the
uniformity of the orders of magnitude of the errors in all elements of
$I_c$. Only in this case, therefore, would it make sense to check the
accuracy of all elements of $I_c$, or, preferably, of all elements of
the residual matrix $I_c - I$. For this situation a measure like the
Euclidean norm could be used to check the accuracy of $I_c - I$ and,
thereby, the accuracy of the inverse matrix.
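The standardizing transformation mentioned above amounts to a few lines of arithmetic. The following sketch is an illustration only; the function name `code_values` is hypothetical, and the transformation itself is the one written as $v = (x - \bar{x})/R_x$ in the text.

```python
def code_values(x_values):
    """Return the coded values (x - mean) / range of one variable."""
    mean = sum(x_values) / len(x_values)
    value_range = max(x_values) - min(x_values)
    if value_range == 0:
        raise ValueError("all x values are equal; the range is zero")
    return [(x - mean) / value_range for x in x_values]
```

For the values 1, 2, 3 the coded values are -0.5, 0 and 0.5. Since the coded values always span a range of exactly 1, they cannot become extremely large or extremely small.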

However, as is shown in Section VII.2.a, the transformation can be
very undesirable for the program user in certain situations. It is
essentially for this reason that in DA-MRCA the accuracy checks on the
identity matrix are restricted to its main diagonal. Since all
$(N+1)^2$ elements of $A^{-1}$ are involved in this check, it is felt
that by this check the program user is sufficiently protected from
inaccurate or fictitious inverses.

In connection with the results of the present section, the reader is
referred to an example case of a 5th order polynomial which is also
given in Section VII.2.a. In this example, the off-diagonal elements
of $I_c$ deviate from zero to a much larger extent than indicated by
(VI-16), which is in accordance with the assumptions leading to
(VI-16). The deviations practically vanish when the x values are
"coded," i.e., when the transformation $v = (x - \bar{x})/R_x$ is
applied.

VI.1.b.(3)  The Choice of 1(1) and 1(2)

Restating from Section VI.1.b.(1), the program rejects an inverse as
unacceptably inaccurate when the largest deviation from 1 in the main
diagonal of $I_c$ exceeds a value, 1(2), specified by the program
user. As to the choice of 1(2), extensive studies have been made by
the authors. One method which was applied to find a direct relation
between the maximum deviation of the $i_{vv}$ from 1 and the accuracy
of the inverse was the computation of perfect fit regression cases.
In these cases the regression sum of squares, as computed by using the
elements of the inverse, via the regression coefficients:

$$ \mathrm{ASSR}_m = \sum_{v=0}^{N} b_v \Sigma_{vy} - \frac{1}{n}\Sigma_{0y}^{2}, $$

can be compared with its hand-computed equivalent. (This is the
exceptional case, mentioned in Section VI.1.b.(1), in which the
accuracy of the inverse can independently be checked.) The results
from the calculated example cases confirmed the experience gained by
the authors in many problems previously solved with DA-MRCA: The
chosen value of 1(2) should lie between 0.001 and 0.01, depending
upon the rank of A. With this choice the analyst can be confident
that inaccurate or fictitious inverses will be rejected by the program
and that, in general, sufficiently accurate inverses will not be
rejected.
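The perfect fit comparison can be illustrated on a small case. In the sketch below the function name `adjusted_regression_ss` and the four data points of the usage note are hypothetical; the formula is the one for the regression sum of squares adjusted for the mean given above.

```python
def adjusted_regression_ss(coefficients, sums_vy, n):
    """Regression sum of squares adjusted for the mean (sketch).

    coefficients: b_0, ..., b_N.
    sums_vy: sums of products of x_v and y (x_0 = 1), for v = 0, ..., N.
    n: number of observations.
    """
    explained = sum(b * s for b, s in zip(coefficients, sums_vy))
    return explained - sums_vy[0] ** 2 / n
```

For the perfect fit y = 2 + 3x at x = 1, 2, 3, 4 one has y = 5, 8, 11, 14, so that the sums are 38 and 110 and the formula gives 45. The same value follows by hand from the squared deviations of y from its average 9.5. A regression sum of squares which departs from the hand-computed value in such a case points to an inaccurate inverse.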


Since the analyst might sometimes wish to visually inspect the whole
calculated identity matrix, DA-MRCA provides for the possibility of
printing it. The decision of whether or not to print $I_c$ is made by
the program: only when none of the elements of $I_c - I$ is in error by
more than a value, 1(1), specified by the program user, will $I_c$ not
be printed. The reason for this device is twofold:

(a) If $I_c$ is not printed, the user knows at once that all errors
are smaller than 1(1).

(b) If the user is not interested in the inspection of $I_c$, he can
possibly choose 1(1) so large (but not larger than 1(2)) that in most
cases $I_c$ will, in fact, not be printed, whereby printout and
printing time of the whole regression problem will be reduced. If he
chooses 1(1) = 1(2), he will get a printout of $I_c$ only in rejection
cases.

Occasionally the program user wants every identity matrix printed. He
can achieve this by putting 1(1) = 0. Otherwise, the choice of the
value of 1(1) must be left to the user. For the purpose of acquainting
the user with the program, concerning the behavior of $I_c$, the
experience of the authors showed that a value of 1(1) in the vicinity
of $10^{-4}$ should be chosen.
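The two decisions described in this section can be summarized in a short sketch. It is an illustration only and not taken from the program listing; the function name `check_calculated_identity` and the argument names `print_tolerance` and `reject_tolerance` (standing for the input values written 1(1) and 1(2) in the text) are hypothetical.

```python
def check_calculated_identity(identity, print_tolerance, reject_tolerance):
    """Return (rejected, printed) for a calculated identity matrix.

    rejected: the largest deviation from 1 on the main diagonal
    exceeds reject_tolerance.
    printed: at least one element of the residual matrix exceeds
    print_tolerance in absolute value.
    """
    size = len(identity)
    diagonal_deviation = max(abs(identity[v][v] - 1.0) for v in range(size))
    largest_deviation = max(
        abs(identity[r][c] - (1.0 if r == c else 0.0))
        for r in range(size) for c in range(size))
    rejected = diagonal_deviation > reject_tolerance
    printed = largest_deviation > print_tolerance
    return rejected, printed
```

Applied to the calculated identity of the numerical example in Section VI.1.b.(2) with tolerances of $10^{-4}$ and $10^{-3}$, the sketch reports that the inverse is accepted and that the matrix is printed, since only an off-diagonal element is far from its exact value.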
VI.1.c  Chi-Square Test on Normality of Residuals

Significance tests based on the main theorem of multiple regression
(Section III.1) and the construction of confidence intervals require
normality of the distribution of the residuals $\varepsilon$ in the
model (I-1). The only way to test the hypothesis of normality is to
examine the distribution of the "estimated" residuals,
$e_i = y_i - Y_i$, i = 1,...,n. This is done in the present program by
the Chi-square test. One should, however, remember that the F test
(III-1) of the main theorem is rather robust with respect to the form
of the distribution of the residuals. Therefore, unless striking
evidence of non-normality is shown by either the bar chart of the
frequency distribution of the $e_i$ or the computed Chi-square value,
or both, the analyst would not be too concerned about the hypothesis
testing aspects. For interval estimation, however, normality as
demonstrated by the $e_i$ is essential.

Both the bar chart and the Chi-square value (if it can be computed)
should, therefore, be considered merely as aids to determine whether a
transformation of the observed values of the dependent variable, y,
would be necessary or helpful to achieve normality or approximate
normality of the residuals. Also, the possible significance of the
computed Chi-square value should not be taken too literally. The
Chi-square test for normality is only an approximation, and the number
of degrees of freedom, m-K-3, obtained by subtracting the number, K+2,
of parameters estimated (K+1 regression coefficients plus the standard
deviation in case of a model containing K IV's) from m-1, where m is
the final number of intervals, certainly is a safe lower limit.

The fixed number of 30 initial intervals into which the observed range
of the residuals is partitioned also deserves some discussion. As
outlined in more detail in Section VI.2.a.(3), the Chi-square
subroutine automatically arrives at a new partitioning of the range
into m ≤ 30 intervals by combining subsets of the 30 initial intervals
into m new intervals such that each one of the m has an expected
number of more than 5 observations. The initial number of 30 intervals
was chosen as a compromise to avoid the extremes of: (1) having, in
most runs, few expected residuals (little more than five) in each of
the final m intervals, and (2) having, in most runs, too small a
number m such that the degrees of freedom of Chi-square, m-K-3, would
be non-positive.
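The combining of intervals and the count of degrees of freedom can be sketched as follows. The actual combining rule of the Chi-square subroutine is given in Section VI.2.a.(3) and is not reproduced here; the left-to-right rule used below, with a remainder added to the last combined interval, is an assumption made for illustration, and the function names `pool_intervals` and `chi_square` are hypothetical.

```python
def pool_intervals(observed, expected, minimum=5.0):
    """Combine adjacent intervals until each expects more than minimum.

    Assumed rule: intervals are accumulated from left to right; a
    remainder at the right end is added to the last combined interval.
    Returns a list of [observed, expected] pairs.
    """
    pooled = []
    observed_sum = expected_sum = 0.0
    for count, expectation in zip(observed, expected):
        observed_sum += count
        expected_sum += expectation
        if expected_sum > minimum:
            pooled.append([observed_sum, expected_sum])
            observed_sum = expected_sum = 0.0
    if observed_sum or expected_sum:
        if pooled:
            pooled[-1][0] += observed_sum
            pooled[-1][1] += expected_sum
        else:
            pooled.append([observed_sum, expected_sum])
    return pooled


def chi_square(pooled, k_variables):
    """Return (value, degrees of freedom) with m - K - 3 degrees.

    The value is None when the degrees of freedom are non-positive,
    i.e., when Chi-square cannot be computed.
    """
    degrees = len(pooled) - k_variables - 3
    if degrees <= 0:
        return None, degrees
    value = sum((o - e) ** 2 / e for o, e in pooled)
    return value, degrees
```

The second function makes explicit why too small a number m of final intervals is undesirable: with m = 2 and K = 0, for example, the degrees of freedom are -1 and no Chi-square value is obtained.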


VI.1.d  IVOR

In this section the basic steps of the computational procedure of IVOR
("Independent Variable Ordering by Regression sums of squares") are
explained. The principles of this ranking method and its applications,
along with those of BIVOR, were already discussed in Section III.2;
whereas the computational details in the subroutine IVOR (including
the relevant checks for the acceptability of a rerun and internal
decisions based on these checks) are given in Section VI.2.

The N independent variables (OCIV's and GCIV's, or OCIV's only) in the
preconceived model of a regression problem are optionally divided into
$M_I$ consecutive groups according to the IV input sequence, with
$N_j$ independent variables in the respective groups, j = 1,...,$M_I$.
The primary purpose of the grouping option is to allow the possibility
of ranking IV's under "restricted admissibility." (This type of ranking
has several applications as discussed in Sections II.3 and VII.2.a.)
Another use of the grouping feature is as a device to save computing
time; see the remarks at the end of this section (VI.1.d). Not all N
IV's in a given regression problem need be included in the grouping.
If the total number,

$$ \sum_{j=1}^{M_I} N_j, $$

of the independent variables in the $M_I$ groups is less than N, the
last (or rightmost)

$$ N - \sum_{j=1}^{M_I} N_j $$

independent variables are excluded from the IVOR ordering. If the user
does not want to use the grouping at all, he should put all IV's in one
group, i.e., let $M_I = 1$ and $N_1 = N$. (See input preparation for
Card Type 4, Section V.2.)

IVOR starts the ordering within the first (or leftmost)
group of $N_1$ IV's and, after having completed the ordering within that
group, proceeds to the second group and further to the right until the
ordering is completed within all $M_I$ groups.

For the present description only, the IV's of group j,
$j = 1,\ldots,M_I$, are denoted by $x_h^{(j)}$, $h = 1,\ldots,N_j$. With this
notation, the first $N_1$ steps of IVOR are:

First Step. Each of the $N_1$ IV's of the first group
($x_h^{(1)}$; $h = 1,\ldots,N_1$) is included in the model, one at a time, as
the only independent variable in the model. For each IV the ASSR value
(Regression Sum of Squares Adjusted for the mean) is computed. Among
these $N_1$ ASSR values the maximum is found and the independent variable
whose inclusion in the model led to the maximum is denoted as $x_{(1)}^{(1)}$.
Accordingly, $x_{(1)}^{(1)}$ is considered as the most important IV in the
first group.

Second Step. Each of the $N_1-1$ IV's of the first group
which have not yet been ranked ($x_h^{(1)}$; $h = 1,\ldots,N_1$, but $\neq (1)$)
is included in the model, one at a time, together with $x_{(1)}^{(1)}$, the
IV ranked most important in the first step. That is, in the second
step, the model always contains two IV's, of which one is $x_{(1)}^{(1)}$.
Then the $N_1-1$ ASSR values due to the $N_1-1$ sets of two IV's are
computed and the maximum is found. The independent variable which,
in union with $x_{(1)}^{(1)}$, led to this maximum ASSR value is denoted as
$x_{(2)}^{(1)}$ and is considered as the second most important IV in the
first group.

Third Step. Each of the $N_1-2$ IV's of the first group
which have not yet been ranked ($x_h^{(1)}$; $h = 1,\ldots,N_1$, but $\neq (1)$
and $(2)$) is included in the model, one at a time, together with
$x_{(1)}^{(1)}$ and $x_{(2)}^{(1)}$. Then the $N_1-2$ ASSR values due to the
$N_1-2$ sets of three IV's are computed and the maximum is found. The
independent variable which, together with $x_{(1)}^{(1)}$ and $x_{(2)}^{(1)}$,
led to this maximum is denoted as $x_{(3)}^{(1)}$ and considered as the
third most important IV in the first group.

Step 4 to Step $N_1$. The procedure is continued,
corresponding to Steps 1-3, until $x_{(N_1-1)}^{(1)}$ is found in Step $N_1-1$.
In Step $N_1$, the remaining IV in the first group is, naturally,
considered to be the least important one and is denoted as $x_{(N_1)}^{(1)}$.

The remaining steps of IVOR are as follows:

Step $N_1+1$. Each of the $N_2$ IV's of the second group
($x_h^{(2)}$; $h = 1,\ldots,N_2$) is included in the model, one at a time,
together with all $N_1$ IV's of the first group. Then the $N_2$ ASSR values
are computed, each one due to $N_1+1$ IV's. Among these $N_2$ ASSR values
the maximum is found and the independent variable of the second group
whose inclusion in the model led to this maximum is denoted as $x_{(1)}^{(2)}$.
This IV is considered as the most important independent variable in
the second group.

Steps ($N_1+2$) to ($N_1+N_2$) follow correspondingly.

The procedure is continued with the third group, fourth
group, etc., until all independent variables in all groups have been
ranked.
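
The forward ordering described in the steps above can be sketched as follows. This is a minimal illustration in Python, not the FORTRAN IV code of DA-MRCA; the names `solve`, `assr` and `ivor` and the small data set are hypothetical, and the accuracy checks of Section VI.2 are omitted. For simplicity the sketch also evaluates ASSR for the last remaining IV of a group, although the text ranks that IV without further computation.

```python
def solve(a, rhs):
    """Solve a*x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def assr(X, y, cols):
    """Regression sum of squares adjusted for the mean, model = IV's in cols."""
    n = len(y)
    Z = [[1.0] + [row[c] for c in cols] for row in X]      # x_0 = 1
    k = len(cols) + 1
    A = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
    g = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(k)]
    b = solve(A, g)                                        # normal equations
    return sum(bi * gi for bi, gi in zip(b, g)) - g[0] ** 2 / n


def ivor(X, y, group_sizes):
    """Return column indices in IVOR order (most important first per group)."""
    ranked, start = [], 0
    for size in group_sizes:                 # groups follow the input sequence
        pending = list(range(start, start + size))
        while pending:
            # candidate joins all IV's ranked so far; keep the largest ASSR
            best = max(pending, key=lambda c: assr(X, y, ranked + [c]))
            ranked.append(best)
            pending.remove(best)
        start += size
    return ranked


# Hypothetical 2^3 design with y = x0 + 3*x1 + 2*x2 (0-based column indices).
X = [[-1, -1, -1], [-1, -1, 1], [-1, 1, -1], [-1, 1, 1],
     [1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]]
y = [-6, -2, 0, 4, -4, 0, 2, 6]
print(ivor(X, y, [3]))      # one group:  [1, 2, 0]
print(ivor(X, y, [2, 1]))   # two groups: [1, 0, 2]
```

With two groups the third column can only be ranked after both columns of the first group, which is the "sub-ranking" effect of the grouping feature.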

The procedure thus described may be called the "standard"
IVOR procedure. However, since the number of matrix inversions and
relevant computations performed by the "standard" IVOR routine may
result in excessive computer time, an input parameter, IQ (columns 1
and 2, Card Type 4), is available for possible use in limiting the
number of IV's to be ordered by IVOR. If IQ > 0, only the IQ most
important independent variables will be found, i.e., ordered by IVOR
under this option, and the N-IQ least important IV's will not be
ordered at all. IQ must fulfill the inequality

$$IQ < \sum_{j=1}^{M_I} N_j,$$

but can otherwise be chosen freely, such that, for example, the
ordering may cease after some IV's of a given group and all IV's in
the previous group(s) have been ordered. For example, with

$$IQ = \sum_{j=1}^{j^*} N_j + 3,$$

where $N_{j^*+1} > 3$, say, IVOR will first order the $\sum_{j=1}^{j^*} N_j$
independent variables in the first j* groups as described above. Then
it will find, among all $N_{j^*+1}$ IV's of group j*+1, the three most
important ones in the usual manner and cease ordering. The last
$N_{j^*+1}-3$ IV's in group j*+1 and all IV's in the subsequent groups will
be left unordered.
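
The effect of IQ on the individual groups can be illustrated with a few lines of Python. This is only a sketch; the function name `ordered_counts` is hypothetical, and treating IQ ≤ 0 as "order all grouped IV's" is an assumption, since the text only describes the case IQ > 0.

```python
def ordered_counts(group_sizes, iq):
    """Number of IV's that would be ordered in each group for a given IQ."""
    total = sum(group_sizes)
    # Assumption: IQ <= 0 means the standard procedure (order everything).
    remaining = total if iq <= 0 else min(iq, total)
    counts = []
    for size in group_sizes:        # groups are processed left to right
        take = min(size, remaining)
        counts.append(take)
        remaining -= take
    return counts


# Groups of 4, 5 and 6 IV's with IQ = 4 + 3: the first group is ordered
# completely, three IV's of the second group are found, the rest is skipped.
print(ordered_counts([4, 5, 6], 7))   # [4, 3, 0]
```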

Two  remarks  should  be  made  with  respect  to  the  grouping 
feature  in  the  IVOR  procedure. 

The  first  concerns  its  use  as  a  means  to  rank  IV's  under 
restricted  admissibility.  Namely,  the  sequence  in  which  the  IV's, 
especially  GCIV's,  are  input  to  the  program  is  critical  when  the 
grouping  option  is  exercised  for  this  purpose.  Since  the  allocation 
of  the  IV's  to  the  various  groups  is  performed  according  to  the  input 
sequence,  it  is  necessary  to  input  first  all  those  IV's  which  would  be 
admissible  for  ranking  at  the  first  step  of  IVOR  and,  therefore,  would 
define  the  first  group.  In  general,  these  would  be  the  OCIV's,  that  is, 
IV's  with  a  powersum  of  1.  In  general,  all  IV's  with  a  powersum  of  2 
would  follow  next,  that  is,  all  GCIV's  representing  terms  of  second 
order;  etc.  In  other  words,  the  GCIV's  would  have  to  be  specified  in 
the  sequence  indicated  in  the  example  given  for  Card  Type  3  (see 
Section  V.2) . 

The second remark concerns the use of the grouping feature
as another device (along with the IQ feature) to save computing time.
One such time saving effect is achieved by specifying

$$\sum_{j=1}^{M_I} N_j < N,$$

provided the user is willing to save time by not ranking the

$$N - \sum_{j=1}^{M_I} N_j$$


rightmost  IV's.  Also,  the  user  can  group  the  IV's  by  some  preconceived 
scale  of  importance  which,  in  case  of  GCIV's  being  present,  may  or  may 
not  be  the  grouping  required  for  ranking  under  restricted  admissibility. 
Computing  time  is  saved  because  the  IVOR  ordering  always  takes  place 
within  only  one  group  at  a  time,  which  leads  to  fewer  matrix  inversions 
and  relevant  computations  than  would  be  necessary  when  the  IV's  were  not 
grouped. Again, the user has to specify the input order of IV's such
that  this  grouping  by  preconceived  importance  is  possible.  When  choosing 
time  saving  devices  in  IVOR,  the  user  should  clearly  distinguish  between 
the  consequences  of  using  IQ  and  the  grouping  feature. 

The program user should be aware that whenever he applies
the grouping feature (with $M_I > 1$), IVOR will give a ranking of
independent variables, by prediction power for the dependent variable,
within only the designated groups of IV's. This ranking may be called
"sub-ranking", in contrast to the ranking when all IV's are considered
to be in one group ($M_I = 1$). (See also the discussion of the ranking
results for the example problem in Section VI.5.)


VI.1.e  BIVOR


The  computational  procedure  of  BIVOR  ("Backward  Independent 
Variable  Ordering  by  Regression  sums  of  squares")  is  based  on  principles 
similar  to  those  of  IVOR  which  were  discussed  in  the  last  section.  In 
the  present  section,  therefore,  the  essential  steps  of  BIVOR  are  given 
while reference is often made to Section VI.1.d.

The optional grouping of IV's is done in the same manner
as in IVOR; however, the number, $M_B$, of groups in BIVOR and the numbers,
$N_q$, of IV's in the groups ($q = 1,\ldots,M_B$) may be different from $M_I$
and the $N_j$ of IVOR, respectively, when both options, IVOR and BIVOR,
are exercised. Also in BIVOR, the

$$N - \sum_{q=1}^{M_B} N_q$$

rightmost IV's may be excluded from the ordering. As to the use of the
grouping feature in BIVOR, see the remarks at the end of the present
section.

For the following description it will be assumed that

$$\sum_{q=1}^{M_B} N_q = N,$$

which does not affect the general validity of the description. BIVOR
starts the ordering within the last (or rightmost) group of $N_{M_B}$ IV's
and, after having completed the ordering within that group, proceeds
to the next to last group and further to the left until the ordering
is completed within all $M_B$ groups. In more detail, the first
$N_{M_B} = N_M$ steps of BIVOR are as follows. (For clarity and for the rest
of the present section only, the subscript "B" (for BIVOR) will be
eliminated from all terms such that $M_B$ becomes M and $N_{M_B}$ becomes
$N_M$.)

First Step. All

$$\sum_{q=1}^{M} N_q = N$$

independent variables are included in the model and the corresponding
matrix of the normal equations is inverted. Then the $N_M$ additional
regression sums of squares, $SS_{N-(N-1)} = SS_1$, which are due to each of
the $N_M$ IV's contained in the last group, are computed. Their values
are obtained by computing $[b_v^{(1)}]^2 / c_{vv}^{(1)}$ (see Hader and Grandage
[1958], p. 126), where the $b_v^{(1)}$ are the regression coefficients of
the $N_M$ IV's in the last group, and the $c_{vv}^{(1)}$ are the corresponding
main diagonal elements of the inverse matrix. Of these $N_M$ $SS_1$ values
the minimum is found and the IV whose deletion led to it is denoted as
$x_{(1)}^{(M)}$. Accordingly, this independent variable is ranked as the
least important one in the last group. Notice that this IV which was
ranked first, as the least important one, received the subscript "(1)."
In IVOR it was the most important IV which received the subscript "(1)."
This convention is correspondingly applied in the following steps of
BIVOR.

Second Step. The IV found least important in the first
step, $x_{(1)}^{(M)}$, is deleted from the model and the matrix of the normal
equations corresponding to the N-1 IV's remaining in the model is
inverted.

In order to find the minimum of the $N_M-1$ values
$SS_{N-(N-2)} = SS_2$, due to the least important IV found in the first
step plus any one of the $N_M-1$ IV's not yet ranked in the last group,
the following relation is used. By the additivity property of additional
regression sums of squares one has $SS_2 = SS_1^{(1)} + SS_1^{(2)}$, where
$SS_1^{(1)}$ is due to the least important IV in the last group, and
$SS_1^{(2)}$ is the additional regression sum of squares (after $x_{(1)}^{(M)}$
is deleted from the model) due to any one of the $N_M-1$ IV's not yet
ranked in the last group. Since $SS_1^{(1)}$ is a constant in the search
for the minimum of $SS_2$, only the $N_M-1$ $SS_1^{(2)}$ values need be
searched for the minimum. These values are obtained in the program by
computing the terms $[b_v^{(2)}]^2 / c_{vv}^{(2)}$, where the $b_v^{(2)}$ are the
regression coefficients (at the second step) of the $N_M-1$ IV's and the
$c_{vv}^{(2)}$ are the corresponding main diagonal elements of the inverse
matrix. Of these $N_M-1$ values the minimum is found and the IV whose
deletion led to it is denoted as $x_{(2)}^{(M)}$. Accordingly, this IV is
ranked as the next-to-least important one in the last group.


Step 3 to Step $N_M$. The procedure is continued, corresponding
to the first two steps, until $x_{(N_M-1)}^{(M)}$ is found in Step $N_M-1$. In
Step $N_M$, the remaining IV in the last group is, naturally, considered
to be the most important one and is denoted as $x_{(N_M)}^{(M)}$.

The remaining $N-N_M$ steps of BIVOR are as follows:

Step $N_M+1$. All $N-N_M$ IV's are included in the model and the
corresponding matrix of the normal equations is inverted. The minimum
of the additional regression sums of squares,
$SS_{N-(N-N_M-1)} = SS_{N_M+1}$, is found by searching for the minimum of
the values $[b_v^{(N_M+1)}]^2 / c_{vv}^{(N_M+1)}$. Here, the $b_v^{(N_M+1)}$ are the
regression coefficients (at Step $N_M+1$) of each of the $N_{M-1}$ IV's of
Group M-1 and the $c_{vv}^{(N_M+1)}$ are the corresponding main diagonal
elements of the inverse matrix. The IV whose deletion (from Group M-1)
led to the minimum is denoted as $x_{(1)}^{(M-1)}$ and is ranked as the
least important one in Group M-1.

Steps ($N_M+2$) to ($N_M+N_{M-1}$) follow correspondingly.

The procedure is continued through the remaining M-2 groups
until all independent variables in all groups have been ranked.
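
The backward ordering can be sketched in the same way. The following Python fragment is an illustration only, not the DA-MRCA code: the names `invert`, `deletion_ss` and `bivor` and the data are hypothetical, the accuracy checks on the inverse are left out, and the sketch inverts the matrix once more for the last IV of each group although the text ranks that IV without a further search.

```python
def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        d = m[col][col]
        m[col] = [x / d for x in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [row[n:] for row in m]


def deletion_ss(X, y, cols):
    """b_v^2 / c_vv for every IV in the current model (intercept excluded)."""
    Z = [[1.0] + [row[c] for c in cols] for row in X]      # x_0 = 1
    k = len(cols) + 1
    A = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
    g = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(k)]
    C = invert(A)
    b = [sum(C[i][j] * g[j] for j in range(k)) for i in range(k)]
    return {c: b[i + 1] ** 2 / C[i + 1][i + 1] for i, c in enumerate(cols)}


def bivor(X, y, group_sizes):
    """Return column indices in BIVOR order (least important first per group,
    starting with the last group)."""
    groups, start = [], 0
    for size in group_sizes:
        groups.append(list(range(start, start + size)))
        start += size
    model, ranked = list(range(start)), []
    for group in reversed(groups):           # rightmost group first
        pending = group[:]
        while pending:
            ss = deletion_ss(X, y, model)
            worst = min(pending, key=lambda c: ss[c])
            ranked.append(worst)
            pending.remove(worst)
            model.remove(worst)              # deleted before the next step
    return ranked


# Hypothetical 2^3 design with y = x0 + 3*x1 + 2*x2 (0-based column indices).
X = [[-1, -1, -1], [-1, -1, 1], [-1, 1, -1], [-1, 1, 1],
     [1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]]
y = [-6, -2, 0, 4, -4, 0, 2, 6]
print(bivor(X, y, [3]))      # one group:  [0, 2, 1]
print(bivor(X, y, [2, 1]))   # two groups: [2, 0, 1]
```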


The additional regression sums of squares as computed in
BIVOR deserve some more discussion. The quantity

$$\frac{b_v^2}{c_{vv}}$$

equals the familiar numerator in the F statistic to test the hypothesis
$\beta_v = 0$ in a model containing, say, N' IV's:

$$F = \frac{b_v^2 / c_{vv}}{s^2} \qquad \text{(VI-17)}$$

where $s^2$ is the residual variance of that model. In other words, the
quantities $b_v^2/c_{vv}$ used in BIVOR to find the least
important IV in a given group at a given step (with a model containing
N' IV's), are equal to the quantities used to test, in the familiar
manner and one at a time, the significance of the N' regression
coefficients. However, because of the correlations that generally
exist among all the N' IV's, one would not obtain a meaningful ordering
of IV's if the F values (VI-17) of the IV's were computed and ranked
according to their magnitudes. Therefore, at a given step of BIVOR,
the least significant of these quantities is selected and the corresponding
IV is deleted from the model, whereupon at the next step, again the
smallest of the $b_v^2/c_{vv}$ quantities is found and again the corresponding
IV is deleted from the model, and so on. This process then leads to
the BIVOR ranking of independent variables by prediction power for the
dependent variable, as described.


Because of the possible existence of compounds (see
Section III.2) the minimum values of the $b_v^2/c_{vv}$ quantities can vary
considerably from one step of BIVOR to the next. In fact, once a
significant model has been found based on the BIVOR ordering and on
the main theorem F value, (III-1), independent variables ranked as
"more important" could very well have $b_v^2/c_{vv}$ values which are much
smaller than the one corresponding, for example, to the "least
important" IV of the significant model. This would appear as if less
significant IV's were ranked as being more important than the more
significant IV's. However, this conclusion is wrong, and the right
conclusion should be that a compound is present.


As  in  IVOR,  the  grouping  feature  in  BIVOR  can  be  used  as 
a  means  to  rank  IV's  under  restricted  admissibility.  This  grouping  is 
done in much the same way as was discussed in the last section (VI.1.d)
and  has  the  same  possible  consequences  with  respect  to  "subranking"  as 
were  mentioned  there. 


In BIVOR, the grouping option is, besides its application
to ranking under restricted admissibility, the only device available
to save computing time. The fact that not all N IV's of the
preconceived model need be included in the grouping makes the time
saving possible. With

$$\sum_{q=1}^{M} N_q < N$$

the last

$$N - \sum_{q=1}^{M} N_q$$

independent variables will be excluded from the BIVOR ordering.


VI.2  Computational Details

In this section the computational details of DA-MRCA are
described for one regression problem. The intention is not to give a
description of the details contained in the flow charts (Section VIII.2)
or in the program listing (Section VIII.4), but rather to describe the
more important computations and decisions made by the program, inasmuch
as they are not discussed in previous sections. Also, justifications
are given for some of these details where considered to be helpful in
understanding the program. Along with the description, all possible
statements are quoted which may result from computational decisions and
appear as printout. Whenever mention is made that the "program stops",
this refers to the one regression problem being processed, if not
otherwise stated. In this case, should there be more than one regression
problem to be processed by DA-MRCA, the program would go to the next
problem.

Generally, the order in which the computational details are
described is the order in which they are performed by the program. In
some places this order is not kept for the purpose of a better
understanding of the description.

References to subroutine names are not made since in some
instances the same type of computation is executed, at different places,
by different subroutines. The interested reader is referred to the flow
charts in Section VIII.2.

The computation and use of the "Analysis of Variance Tables"
and of the "Final Comprehensive Analysis Table" are not discussed in
this section. This is done only in Section VI.3.b.


VI.2.a  Main Run

In this section the computational details of the main run
are given. However, most of these computations are correspondingly
performed for any rerun. (See Sections VI.2.b - VI.2.d.)

VI.2.a.(1)  Initial Operations

The  operations  described  in  this  section  are  performed 
only  once  per  regression  problem,  i.e.,  they  are  performed  for  the  main 
run  but  are  not  repeated  if  reruns  are  included  in  the  regression 
problem. 


A. If the total number, IR+IS=N, of independent
variables input is ≤ 0 or ≥ 51, the program stops and the statement
"CARD TYPE 2 IS INCORRECT" is printed. Otherwise (0 < N < 51) the
program continues.

B. If the number, n, of data points input is ≤ 1
or > 7000, the program stops and the statement "TOO FEW OR TOO MANY
DATA POINTS" is printed. Otherwise (1 < n ≤ 7000) the program continues.

C.  The  summation  matrix  is  computed.  However,  only 
the  elements  of  the  main  diagonal  and  those  above  the  main  diagonal 
are  actually  computed.  Since  the  summation  matrix  is  symmetrical,  the 
elements  below  the  main  diagonal  are  merely  copied  from  those  above  the 
diagonal. 
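
Paragraph C can be illustrated by the following Python sketch. The layout assumed here (a leading column $x_0 = 1$, then the IV's, then the dependent variable) and the name `summation_matrix` are assumptions made for the illustration; the report's own storage layout is not shown in this section.

```python
def summation_matrix(rows):
    """Sums of squares and cross products for data points [x_1, ..., x_N, y].

    Only the diagonal and the elements above it are accumulated; the
    lower triangle is then filled in by symmetry.
    """
    aug = [[1.0] + list(r) for r in rows]     # prepend x_0 = 1
    k = len(aug[0])
    S = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i, k):                 # diagonal and above only
            S[i][j] = sum(a[i] * a[j] for a in aug)
    for i in range(k):
        for j in range(i):                    # copy into the lower triangle
            S[i][j] = S[j][i]
    return S


# Two data points (x, y) = (1, 2) and (3, 4).
print(summation_matrix([[1, 2], [3, 4]]))
```

Computing only the upper triangle roughly halves the number of accumulated sums, which is the saving the paragraph describes.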

VI.2.a.(2)  Matrix Inversion and Accuracy Checks

The operations described in the following paragraphs
A - I are performed for the main run and, in general, for any rerun.
The computations are expressed in terms of K independent variables
contained in the model, where K=N defines the main run and K=N' < N
defines a rerun with N' IV's contained in the model.

A. The inverse of the (K+1)x(K+1) matrix A, i.e.,
the inverse, $A^{-1}$, of the matrix of the normal equations, is computed.
(The computational procedures involved in the matrix inversion, the
computation of the determinant and the solution of the normal equations
are explained in detail in Section VI.1.a.)

B. The determinant of A is tested and if found to be
non-positive, the statement "MATRIX FAILED TO INVERT" is printed. For
this case, and in the main run only, the averages of the N IV's and of
the dependent variable are computed and printed and the program goes
to reruns (if any). Also, if the determinant is non-positive for the
main run, there will be no final comprehensive analysis for any type of
reruns (HAND selected, IVOR, or BIVOR), and the following statement is
made at the end of the printout of the regression problem: "NO FINAL
COMPREHENSIVE PRINTOUT SINCE MATRIX FOR MAIN RUN COULD NOT BE INVERTED."
- In case of a hand selected rerun, the program goes to the next hand
selected rerun (if any). In case of an IVOR or BIVOR rerun, see
Sections VI.2.c or VI.2.d, respectively. - If the determinant is positive,
its value is printed, along with the inverse matrix and the solution to
the normal equations (regression coefficients).

C. The following values are computed:

(a) The error sum of squares,

$$SSE = \Sigma_{yy} - \sum_{v=0}^{K} b_v \Sigma_{vy}$$

(where $\Sigma_{yy} = \sum_{i=1}^{n} y_i^2$ and
$\Sigma_{vy} = \sum_{i=1}^{n} x_{vi} y_i$, with $x_{0i} = 1$);

(b) The total sum of squares adjusted for the mean,

$$ATSS = \Sigma_{yy} - \frac{1}{n} \Sigma_{0y}^2;$$

(c) The regression sum of squares (due to K
independent variables) adjusted for the mean,

$$ASSR_K = \sum_{v=0}^{K} b_v \Sigma_{vy} - \frac{1}{n} \Sigma_{0y}^2;$$

(d) The square of the correlation coefficient,
i.e., the coefficient of determination,

$$R^2 = \frac{ASSR_K}{ATSS}.$$
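
The four quantities of paragraph C can be checked on a small example. The Python sketch below is illustrative only; `solve` and `fit_summary` are hypothetical names, the data are made up, and the normal equations are solved directly instead of through the inverse matrix used by the program.

```python
def solve(a, rhs):
    """Solve a*x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_summary(X, y):
    """Return (b, SSE, ATSS, ASSR_K, R^2) for the model with all columns of X."""
    n = len(y)
    Z = [[1.0] + list(row) for row in X]                   # x_0 = 1
    k = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
    s_vy = [sum(z[v] * yi for z, yi in zip(Z, y)) for v in range(k)]
    s_yy = sum(yi * yi for yi in y)
    b = solve(A, s_vy)
    reg = sum(bv * sv for bv, sv in zip(b, s_vy))          # sum of b_v * S_vy
    sse = s_yy - reg
    atss = s_yy - s_vy[0] ** 2 / n
    assr = reg - s_vy[0] ** 2 / n
    return b, sse, atss, assr, assr / atss


# One IV, four data points: the fitted line is y = 0.8 + 2.3 x.
b, sse, atss, assr, r2 = fit_summary([[0], [1], [2], [3]], [1, 3, 5, 8])
print(b, sse, atss, assr, r2)
```

By construction SSE + ASSR_K = ATSS, so R^2 lies between 0 and 1 unless rounding errors in the inversion disturb the coefficients; this is the situation the tests in the following paragraphs guard against.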

D. $R^2$ is tested and if found to be negative, the
statement "SQUARE OF CORRELATION COEFFICIENT IS NEGATIVE" is printed.
For this case, and in the main run only, the operations concerning the
averages and the final comprehensive analysis are performed as
described in paragraph B above. - In case of a hand selected rerun,
the program goes to the next one (if any). In case of IVOR or BIVOR,
see Section VI.2.c or VI.2.d, respectively. - If $R^2 \geq 0$, the
correlation coefficient (R) is computed and printed.

E. The residual variance ($s^2$) is computed by
dividing SSE by n-K-1. The residual variance is then tested and if
found to be negative, the statement "VARIANCE IS NEGATIVE" is printed.
For this case, and in the main run only, the operations concerning the
averages and the final comprehensive analysis are performed as described
in paragraph B above. - In case of a hand selected rerun, the program
goes to the next one (if any). In case of IVOR or BIVOR, see Section
VI.2.c or VI.2.d, respectively. - If $s^2$ is found to be non-negative,
the square root of the residual variance (s) is computed and printed.
(If the quantity n-K-1 = 0, s is set equal to zero and the F value of
the ANOVA table is printed as all nines. This is the case of the
"zero error perfect fit.")


F. The elements of the main diagonal of the inverse
matrix (the $c_{vv}$) are tested. The first element found to be negative
(if any) results in the statement "AN ELEMENT OF THE MAIN DIAGONAL OF
THE INVERSE MATRIX IS NEGATIVE." For this case, and in the main run
only, the operations concerning the averages and the final comprehensive
analysis are performed as described in paragraph B above. - In case of
a hand selected rerun, the program goes to the next one (if any). In
case of IVOR or BIVOR, see Section VI.2.c or VI.2.d, respectively. -
If there are no negative elements on the main diagonal, the standard
deviations of the regression coefficients are computed:

$$\sqrt{V[b_v]} = s \sqrt{c_{vv}}, \quad \text{where } v = 0,1,\ldots,K.$$

G. The elements of the calculated identity matrix
($I_c$), the $i_{vv'}$ ($v,v' = 0,1,\ldots,K$), are obtained by forming the
product of the inverse matrix ($A^{-1}$) and the matrix of the normal
equations (A), in this order. The identity matrix is used for checking
the accuracy of the inversion process. The specifics of this use and
their justifications are discussed in Section VI.1.b.

H. The absolute values of the deviations from 1 of
the main diagonal elements of $I_c$ are tested against I(1). The first
deviation found to be > I(1) (if any) is tested to determine if it is
also ≥ I(2). If it is, the identity matrix is printed with the
statement "DEVIATION OF A MAIN DIAGONAL ELEMENT IN THE IDENTITY MATRIX
LARGER THAN I(2) = .... RUN REJECTED." In the blank the input value
of I(2) is printed. In this case the program goes directly to the
operations described in Section VI.2.a.(3). If the first deviation
which is > I(1) is not ≥ I(2), the testing is continued on the
remaining diagonal elements. If any of the deviations of the main
diagonal elements are ≥ I(1) but none of these deviations is ≥ I(2),
the identity matrix is printed with the statement "DEVIATION OF A
MAIN DIAGONAL ELEMENT IN THE IDENTITY MATRIX LARGER THAN I(1) = ....
BUT LESS THAN I(2) = .... RUN ACCEPTED." In the blanks the input
values of I(1) and I(2) are printed.

I. If all deviations (absolute) of the main
diagonal elements are < I(1), the absolute values of the off-diagonal
elements are tested. The first time that an off-diagonal element
(absolute) is > I(1), the identity matrix is printed with the statement
"DEVIATIONS OF ALL MAIN DIAGONAL ELEMENTS IN THE IDENTITY MATRIX SMALLER
THAN I(1) = .... DEVIATION OF AN OFF-DIAGONAL ELEMENT LARGER THAN I(1).
RUN ACCEPTED." If all off-diagonal elements also have absolute values
< I(1), the identity matrix is not printed, but the statement "DEVIATIONS
OF ALL ELEMENTS OF THE IDENTITY MATRIX SMALLER THAN I(1) = .... RUN
ACCEPTED" is printed.
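
The decisions of paragraphs G - I can be summarized in a short Python sketch. It is an illustration under stated assumptions, not the program's code: the function name `check_inverse` and the four return labels are hypothetical, and the choice between strict and non-strict comparisons follows the wording above, which is itself partly uncertain in the source.

```python
def check_inverse(a, a_inv, i1, i2):
    """Classify an inversion from the calculated identity matrix Ic = A^-1 A.

    i1 and i2 stand for the input tolerances I(1) < I(2).
    """
    n = len(a)
    ic = [[sum(a_inv[i][k] * a[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]                  # product formed in this order
    flagged = False
    for i in range(n):
        dev = abs(ic[i][i] - 1.0)             # deviation from 1 on the diagonal
        if dev > i1:
            if dev >= i2:
                return "rejected"
            flagged = True                    # keep testing remaining elements
    if flagged:
        return "accepted_diagonal_warning"
    off = (abs(ic[i][j]) for i in range(n) for j in range(n) if i != j)
    if any(v > i1 for v in off):
        return "accepted_off_diagonal_warning"
    return "accepted"


a = [[2.0, 0.0], [0.0, 4.0]]
print(check_inverse(a, [[0.5, 0.0], [0.0, 0.25]], 1e-4, 1e-2))      # accepted
print(check_inverse(a, [[0.5005, 0.0], [0.0, 0.25]], 1e-4, 1e-2))   # warning
print(check_inverse(a, [[0.6, 0.0], [0.0, 0.25]], 1e-4, 1e-2))      # rejected
```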

VI.2.a.(3)  Predicted Values, Prediction Errors,
Normality Test, and Averages

The following operations A - I are always performed
for the main run and are optionally performed for reruns. As in the
last section, the operations are expressed in terms of K independent
variables contained in the model.

A. The n predicted values (the $\hat{y}_i$) are computed by
evaluating the obtained regression equation for each of the n input
design points.


B. The prediction errors, $e_i = y_i - \hat{y}_i$, are computed
for each input design point by subtracting the predicted value from the
observed value of the dependent variable. The normality test described
later in this section is performed on these prediction errors. Some
general aspects of the test are discussed in Section VI.1.c.

C. The sum of squares of the prediction errors,

$$\sum_{i=1}^{n} (y_i - \hat{y}_i)^2,$$

is computed. This sum of squares should equal the error sum of squares,
SSE, given in Section VI.2.a.(2), paragraph C, and is identified, when
printed, as the "CHECK ERROR SUM OF SQUARES." The check error sum of
squares is computed as an additional check on the computational accuracy.
Since the values $e_i = y_i - \hat{y}_i$ are already computed, this check is
inexpensive. However, no sensing is built into the program to compare
the two error sums of squares.

D. The maximum and minimum of the n prediction errors
are found and the range (= the maximum prediction error minus the
minimum prediction error) is computed. The range is then divided by
30 to give the common length (D) of the 30 intervals used in the
prediction error frequency distribution. The upper bounds of each of
the 30 intervals are computed by adding D, 2D, 3D, ..., 30D, respectively,
to the minimum prediction error. Thereby, the maximum prediction error
becomes the upper bound of the last interval.

Each prediction error is then assigned to its
proper interval, i.e., to the interval with the smallest upper bound
which is not exceeded by the prediction error. A count is then made
of the number ($f'_j$) of prediction errors observed in each of the 30
intervals. The $f'_j$ are used in the bar chart of the printout, see
the following paragraph (E).
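
The construction of the 30 intervals and the counting of the observed frequencies might look as follows in Python. The function name `error_histogram` is hypothetical, and forcing the last upper bound to equal the maximum prediction error is a rounding guard added for this sketch; the text does not say how the program handles rounding at the upper end.

```python
def error_histogram(errors, n_intervals=30):
    """Return (D, upper bounds, observed counts) for the prediction errors."""
    lo, hi = min(errors), max(errors)
    d = (hi - lo) / n_intervals                      # common interval length D
    uppers = [lo + (j + 1) * d for j in range(n_intervals)]
    uppers[-1] = hi                                  # guard against rounding
    counts = [0] * n_intervals
    for e in errors:
        # smallest upper bound which is not exceeded by the prediction error
        j = next(i for i, u in enumerate(uppers) if e <= u)
        counts[j] += 1
    return d, uppers, counts


d, uppers, counts = error_histogram([0.0, 30.0, 1.0, 1.5, 29.5])
print(d, counts[0], counts[1], counts[29])           # 1.0 2 1 2
```

Note that an error lying exactly on an upper bound is counted in the interval ending at that bound, so the minimum error always falls into the first interval.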

E. The quantity n/5 - (K+3) is computed and checked.
If this quantity is ≤ 0, the bar chart is printed, along with the
statement "CHI SQUARE COULD NOT BE COMPUTED." For this case, and in
the main run only, the program goes to the operations described in
paragraph I below. - In case of a hand selected rerun, the program
goes to the operations described in Section VI.2.a.(4), should the
option for selected and/or synthetic design points be exercised.
This check is a joint consequence of (1) the restriction that
$\hat{\varphi}_j$, the expected number of observations in an interval, should
be greater than 5 and (2) the definition of the degrees of freedom for
the Chi-square statistic as the number of intervals, for which
$\hat{\varphi}_j > 5$, minus K+3. The circumflex on $\varphi_j$ is used to express
the fact that these expected frequencies are based on the estimates of
the mean and the standard deviation of the distribution of the
prediction errors. If the quantity n/5 - (K+3) is ≤ 0, the degrees of
freedom for Chi-square could never be > 0 and further computations
would be meaningless. The restriction on $\hat{\varphi}_j$ and the degrees of
freedom for Chi-square are more fully discussed in the following
paragraph F.

F. If the quantity $n/5 - (K+3)$ is $> 0$, an attempt is made to
compute the Chi-square statistic. The expected frequency distribution
is formed. This distribution gives the number of prediction errors
that would be expected in each of the 30 intervals if the sample of n
prediction errors were actually from a normal distribution having a
mean and standard deviation equal to those of the observed prediction
errors. Since the expected frequency in each interval is computed by a
system subroutine which uses the standardized normal distribution
function, the 30 upper bounds must be standardized by dividing each
upper bound by s. (The average of the observed prediction errors is
zero and, consequently, is not subtracted in standardizing the upper
bound.) The expected frequency in each of the 30 intervals is obtained
by multiplying the number of data points, n, by the probability,
obtained from the standard normal tables, that an observation will be
in a given interval. The expected frequencies in each of the 30
intervals are then examined and, if necessary, some of the intervals
are combined in order that each of the resulting m intervals has an
expected frequency of more than 5. If, for example, the expected
frequency in the first of the 30 intervals is $\le 5$, the frequency is
added to that of the next interval. This procedure is continued until
the first time a new interval results which does have an expected
frequency of more than 5. Succeeding intervals are similarly tested
and, if necessary, combined. If the last interval, or intervals, does
not have an expected frequency of more than 5, it is combined with the
last interval which did have a frequency of more than 5. In this way m
"new" intervals are formed, each of which has an expected frequency,
$\hat{\varphi}_j$, greater than 5.
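
The formation of the expected frequencies and the combining of
intervals can be illustrated by the following minimal Python sketch.
It is not the program's FORTRAN code. The function names, the use of
the error function in place of the system subroutine, and the
convention that the last upper bound may be infinite are assumptions
made for the illustration only.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_frequencies(upper_bounds, n, s):
    """Expected count per interval for n errors from N(0, s**2).

    upper_bounds: increasing upper bounds of the intervals; the last
    one may be math.inf (an assumption about the interval layout).
    """
    expected = []
    prev = 0.0                       # distribution function at -infinity
    for u in upper_bounds:
        cur = normal_cdf(u / s)      # standardize; the mean is zero
        expected.append(n * (cur - prev))
        prev = cur
    return expected

def combine_intervals(expected, minimum=5.0):
    """Group adjacent intervals until each group exceeds `minimum`.

    Returns a list of groups, each a list of original interval indices.
    """
    groups, current, total = [], [], 0.0
    for j, e in enumerate(expected):
        current.append(j)
        total += e
        if total > minimum:
            groups.append(current)
            current, total = [], 0.0
    if current:                      # trailing intervals still too small
        if groups:
            groups[-1].extend(current)
        else:
            groups.append(current)   # degenerate case: one single group
    return groups
```

The number of groups returned by `combine_intervals` plays the role of
m, and the sum of the expected frequencies within a group is the
expected frequency of the corresponding "new" interval.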

G. The number ($f_j$) of observed prediction errors is counted for
each of the m intervals and the contribution to Chi-square is computed
for each interval. The contribution for the jth interval is

$$\frac{(f_j - \hat{\varphi}_j)^2}{\hat{\varphi}_j}$$

where $f_j$ and $\hat{\varphi}_j$ are as defined above. These
contributions to Chi-square are then printed for each of the m
intervals, along with the observed and expected number of observations
in that interval.


H. The quantity m-K-3 is computed. If $m-K-3 \le 0$, the statement
"CHI SQUARE COULD NOT BE COMPUTED" is printed. In this case the
program continues as described in paragraph E above.


If $m-K-3 > 0$, the Chi-square statistic is computed by summing the
individual contributions over the m intervals.
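
The computations of paragraphs G and H can be sketched as follows,
taking the contribution of an interval to be the usual Pearson term
(observed minus expected, squared, divided by expected). The function
name and the returned tuple are hypothetical and chosen only for this
illustration.

```python
def chi_square(observed, expected, K):
    """Return (contributions, statistic, degrees_of_freedom).

    observed, expected: the counts for the m combined intervals.
    K: number of independent variables contained in the model.
    The statistic is None when m - K - 3 <= 0, the case in which
    "CHI SQUARE COULD NOT BE COMPUTED" is printed.
    """
    m = len(expected)
    contributions = [(f - e) ** 2 / e for f, e in zip(observed, expected)]
    dof = m - K - 3
    if dof <= 0:
        return contributions, None, dof
    return contributions, sum(contributions), dof
```

For example, with five intervals and K = 1 there is one degree of
freedom, whereas with only two intervals the statistic is not formed.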


I. Only in the main run are the averages of the N independent
variables and of the dependent variable computed and printed.


VI.2.a.(4) Predicted Values and Prediction Standard
Deviations at Selected Input and/or Synthetic Design
Points


If the run (main run or hand selected rerun) passed all tests in
paragraphs B, D, E, and F of Section VI.2.a.(2), and if selected input
and/or synthetic design points are present (see columns 8-13, Card
Type 2, Section V.2), the coordinates of the OCIV's of these points
are printed and the corresponding predicted values and prediction
standard deviations for either individual observations or for the
prediction line are computed and printed. If the run did not pass the
four tests mentioned above, predicted values and prediction standard
deviations cannot be obtained for either selected input or synthetic
design points.


VI.2.b Hand Selected Reruns

In order to execute a hand selected rerun (if any are specified) the
program deletes the proper rows and columns from the summation matrix
according to the specified independent variable selection of
K = N' < N IV's. The operations described in Section VI.2.a.(2) are
then performed for this IVS (with the exceptions mentioned there). If
NPE=1 (column 15, Card Type 2), the operations of paragraphs A - H of
Section VI.2.a.(3) are also performed for this IVS.


Predictions and prediction standard deviations for selected input
and/or synthetic design points are computed only when the option is
exercised and when the hand selected IVS passed all tests described in
paragraphs B, D, E, and F of Section VI.2.a.(2).

VI.2.c IVOR


In this section the computational details which are performed to
arrive at an IVOR ordering of independent variables are described.
(The IVOR ordering is explained in Section VI.1.d.)

If

$$\sum_{j=1}^{M_I} N_j = N,$$

only the first N-1 steps of IVOR are performed since the main run has
already been performed. There is no possibility in IVOR to call, in
each IVOR rerun, for predictions and prediction standard deviations at
selected input and/or synthetic design points. As indicated before, if
the main run fails any of the tests performed on the determinant,
$R^2$, $s^2$, and the $c_{vv}$'s (as described in paragraphs B, D, E,
and F of Section VI.2.a.(2)), there will be no IVOR Final
Comprehensive Analysis.

At any given step of IVOR (where "step" is as defined in Section
VI.1.d) the following operations are performed:

A. The established IVOR model of the preceding step is augmented by
one independent variable at a time. There may be left, say, H IV's not
yet ordered within the group in which IVOR is presently operating.
Each of the H IV's is added, one at a time, to the IVOR model of the
preceding step by deleting one less row and column from the summation
matrix than in the previous step. Each of the H corresponding matrices
of the normal equations (A) is then inverted and its determinant
computed.

B. The procedure to decide whether or not to accept any of the H
independent variable selections for further consideration at this step
depends upon whether the main run was accepted or rejected.
("Acceptance" is defined as passing all 5 tests described in
paragraphs B, D, E, F, and H of Section VI.2.a.(2). "Rejection" is
defined as failing one or more of these tests.)

(B.a) If the main run was accepted: The determinant is checked for
each of the H IVS's and if found to be non-positive, this IVS is
excluded from further consideration at this step. For all IVS's with
non-positive determinants the statement "MATRIX FAILED TO INVERT,
IVS ____." is printed, where the blank is filled by the identification
of the IVS. For all IVS's whose determinant is found to be positive
the ASSR value is computed. Should all H determinants be non-positive
the statement "NO VALID ASSR'S WERE COMPUTED" is printed and the IVOR
ordering is terminated.
(B.b) If the main run was rejected: $R^2$ and $s^2$ are computed for
each one of the H IVS's. $R^2$ and $s^2$ are then tested to determine
if either of them is negative, and the determinant is tested to
determine if it is non-positive. If a failure occurs, the statements
concerning the determinant, $R^2$, and $s^2$ as given in paragraphs B,
D, and E of Section VI.2.a.(2) are printed along with the IVS
identification. These IVS's are excluded from further consideration at
this step. Then the operations described in paragraphs F, G, H, and I
of Section VI.2.a.(2) are performed for each one of the H or the
remaining IVS's. If for a given IVS an element of the main diagonal of
the inverse matrix is found to be negative, the appropriate statement
is printed and this IVS is excluded from further consideration at this
step. If an IVS has to be excluded from further consideration because
an element of the main diagonal of the identity matrix has an absolute
deviation from 1 greater than 1(2), the appropriate statement is
printed together with the identification of the IVS. (The other
possible statements concerning the elements of the identity matrix are
printed only when the IVS is later chosen as the established IVOR
model of this step.) If none of the H IVS's could be accepted, IVOR
stops and prints "NO VALID ASSR'S WERE COMPUTED."


C. If, in either case of paragraph B (above), only one IVS of the H
considered led to a valid ASSR value, this IVS represents the
established IVOR model at this step. In other words, the individual IV
whose inclusion led to the only valid ASSR value is ordered as the
independent variable with the maximum contribution to the "total"
regression sum of squares at this step. For this IVS, all pertinent
printouts are given. Also computed and printed for this IVS, provided
the option is exercised for reruns, are the predicted values, the
prediction errors and the normality test as described in Section
VI.2.a.(3). IVOR then goes to the next step (if there is any).


D. If more than one IVS in paragraph B (above) led to a valid ASSR
value, these values are compared among themselves as follows. The
valid ASSR value corresponding to the IVS with the leftmost IV added
to the model of the preceding step is denoted as
$\mathrm{ASSR}^{(1)}$. Then for each of the remaining valid ASSR
values (the $\mathrm{ASSR}^{(i)}$'s, say) the following quantities are
computed:

$$\Delta_i = \frac{\mathrm{ASSR}^{(1)} - \mathrm{ASSR}^{(i)}}{\mathrm{ASSR}^{(1)}}$$


(D.a) If none of the quantities exceeds the fixed value
$.5 \times 10^{-8}$, all of the ASSR's are considered to be equal and
a "perfect fit" is considered to have been reached. (When a perfect
fit is being reached, each IV contributes the same additional
regression sum of squares towards the ASSR value of this perfect fit.)
The leftmost IV is then defined as the most important IV ordered at
this step, and a complete printout (as discussed in paragraph C above)
is given for the corresponding IVS, along with the statement "PERFECT
FIT. IVS = ____." The IVOR subroutine then stops completely.

(D.b) If one or more of the quantities $\Delta_i$ exceeds the value
$.5 \times 10^{-8}$, the maximum ASSR value is found and the IV which
led to the maximum is considered as the most important IV at this
step. A complete printout (as in paragraph C above) is given for the
corresponding IVS, and the IVOR subroutine goes to the next step (if
there is any).
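
The decision logic of paragraphs C and D can be summarized in the
following Python sketch. It is an illustration only, not the IVOR
subroutine itself. The function name, the list-of-pairs input, and in
particular the use of the absolute value of the relative difference
are assumptions of the sketch; the ASSR values are assumed to be
positive.

```python
TOLERANCE = 0.5e-8   # the fixed value .5 x 10^-8 quoted in the text

def choose_iv(valid_assr):
    """Pick the IV ordered at one IVOR step.

    valid_assr: list of (iv_number, assr) pairs, in left-to-right IV
    order, for the IVS's that produced a valid ASSR value.
    Returns (iv_number, perfect_fit), or None if no value is valid.
    """
    if not valid_assr:
        return None                      # "NO VALID ASSR'S WERE COMPUTED"
    first_iv, first = valid_assr[0]
    if len(valid_assr) == 1:
        return first_iv, False           # paragraph C: only one candidate
    # Relative differences to the leftmost value; taking the absolute
    # value is an assumption of this sketch.
    deltas = [abs(first - a) / first for _, a in valid_assr[1:]]
    if all(d <= TOLERANCE for d in deltas):
        return first_iv, True            # paragraph D.a: perfect fit
    best_iv, _ = max(valid_assr, key=lambda pair: pair[1])
    return best_iv, False                # paragraph D.b: maximum ASSR
```

When `perfect_fit` is returned as true the ordering stops; otherwise
the chosen IV joins the established model and the next step begins.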


VI.2.d BIVOR


In this section the computational details which are performed to
arrive at a BIVOR ordering of independent variables are described.
(The BIVOR ordering is explained in Section VI.1.e.)

If $\sum_{q=1}^{M_B} N_q < N$, BIVOR deletes the last
$(N - \sum_{q=1}^{M_B} N_q)$ independent variables from the model of
the main run by deleting the corresponding rows and columns from the
summation matrix. BIVOR then starts the ordering by inverting the
matrix with $\sum_{q=1}^{M_B} N_q$ independent variables contained in
the model.
There is no possibility in BIVOR to call, in each BIVOR rerun, for
predictions and prediction standard deviations at selected input
and/or synthetic design points. As indicated before, if the main run
failed any of the tests performed on the determinant, $R^2$, $s^2$,
and the $c_{vv}$'s (as described in paragraphs B, D, E, and F of
Section VI.2.a.(2)), there will be no BIVOR Final Comprehensive
Analysis.

The operations at any given step of BIVOR (where "step" is as defined
in Section VI.1.e) are dependent upon whether or not the preceding
step led to an accepted BIVOR rerun.
A. If the main run was rejected and all preceding steps of BIVOR (if
any) led to rejected reruns, the operations are as follows:


(A.a) From the BIVOR model (which was rejected) of the preceding step
the rightmost IV is deleted by deleting the corresponding row and
column from the matrix of the normal equations (A) of the preceding
step. Then the elements of the inverse matrix $A^{-1}$, the
determinant of A, $R^2$, and $s^2$ are computed. These values are
subjected to the respective tests described in paragraphs B, D, E, and
F of Section VI.2.a.(2). If the new IVS fails any of these tests,
again the rightmost IV is deleted from the model for the next step and
the checks are repeated for the new model. If the new IVS passes all 4
tests, the operations of the next paragraph (A.b) are performed.


(A.b) The identity matrix, $I_c = A^{-1}A$, is computed for the
present step's IVS (which passed the four checks mentioned in the last
paragraph). Then the checks as described in paragraphs H and I,
Section VI.2.a.(2), are performed on the elements of $I_c$. The first
time a main diagonal element of $I_c$ has an absolute deviation from 1
which is greater than 1(2), the IVS of the present step will be
rejected. However, in this case this IVS will be given a complete
printout, including the predicted values, prediction errors and
normality test (Section VI.2.a.(3)). The reason for this treatment is
that the value of 1(2) is, after all, an optional input value chosen
by the program user and that the IVS rejected on the grounds of 1(2)
may be marginal in its accuracy but essentially acceptable.

By having the printout for this run, the analyst is given additional
information as to the possibility of reconsidering the regression
problem with some of the input parameters changed. There is, in this
case, a certain danger of misinterpretation of the printout. Although
at each individual BIVOR rerun the statement is printed that this run
is rejected, it could appear, from the final comprehensive analysis
(if this is printed), as if the series of deletions from the right was
a genuine BIVOR ordering of independent variables. This will occur
most likely when the value of 1(2) was chosen too small. Also in this
case (of the BIVOR IVS failing only the $I_c$ test) the subroutine
goes to the next step by deleting the rightmost IV from the model.

If  the  IVS  of  the  present  step  is  accepted,  the  operations 
of  the  next  paragraph  are  performed. 
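
The check on the calculated identity matrix can be illustrated with
the short Python sketch below. The function names and the returned
flags are hypothetical; `tol1` and `tol2` stand for the input
tolerances referred to in the text as 1(1) and 1(2). The sketch only
reports the two conditions named in this section and does not
reproduce the full set of statements of paragraphs H and I of Section
VI.2.a.(2).

```python
def calculated_identity(a_inv, a):
    """Return the product of a_inv and a (square lists of rows)."""
    size = len(a)
    return [[sum(a_inv[i][k] * a[k][j] for k in range(size))
             for j in range(size)] for i in range(size)]

def identity_deviations(i_c):
    """Absolute deviations of i_c from the exact identity matrix."""
    return [[abs(x - (1.0 if i == j else 0.0)) for j, x in enumerate(row)]
            for i, row in enumerate(i_c)]

def check_identity(a_inv, a, tol1, tol2):
    """Report the two conditions used in the BIVOR identity checks."""
    dev = identity_deviations(calculated_identity(a_inv, a))
    size = len(dev)
    # A main diagonal element deviating from 1 by more than tol2
    # leads to rejection of the IVS.
    diag_fails = any(dev[i][i] > tol2 for i in range(size))
    # All deviations smaller than tol1: the "RUN ACCEPTED" statement.
    all_small = all(d < tol1 for row in dev for d in row)
    return {"rejected_on_tol2": diag_fails, "all_below_tol1": all_small}
```

A run rejected on `tol2` alone is the marginal case discussed above,
which still receives a complete printout.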

(A.c) If the IVS of the present step was accepted, i.e., passed all
five checks described in paragraphs (A.a) and (A.b) above, the
additional regression sums of squares ($= b_v^2 / c_{vv}$) are
computed for all IV's not yet ordered in the group in which BIVOR is
presently operating. If there is more than one of these additional
regression sums of squares, the minimum is found and the IV which led
to it is ranked as the least important one at this step. Since the
accepted IVS of this step represents the first accepted rerun of
BIVOR, it is given the complete printout, including predictions,
prediction errors, and the normality test. BIVOR then goes to the next
step (if any), as described for this case in the next paragraph (B).

B. If the main run and/or the IVS of any previous step has been
accepted, BIVOR goes to the next step by computing the additional
regression sums of squares for all IV's which have not yet been
ordered in the group in which BIVOR is presently operating.

The values are compared and the IV which led to the minimum additional
regression sum of squares is deleted from the model. The matrix A of
this new IVS is inverted and the determinant, $R^2$, $s^2$, and $I_c$
are computed and the corresponding tests are performed as described in
paragraphs B, D, E, F, H, and I of Section VI.2.a.(2). (If the option
described in paragraph C below is chosen, the tests on the elements of
$I_c$ are terminated with that rerun in which all absolute deviations
of the matrix elements are < 1(1) for the first time.) This BIVOR
rerun is given a full printout, including the predicted values,
prediction errors, and normality test if this option is exercised for
reruns. The BIVOR ordering is terminated when an IVS arrived at
contains only one independent variable.
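
The selection of the IV to be deleted at a BIVOR step can be sketched
as follows. The sketch takes the additional regression sum of squares
of IV number v to be the square of its regression coefficient divided
by the corresponding main diagonal element of the inverse matrix (the
standard result for deleting one variable); this, the function name
and the argument layout are assumptions of the illustration.

```python
def least_important_iv(b, c_diag, candidates):
    """Return (iv, values) for one BIVOR deletion step.

    b:          regression coefficients, index 0 being the constant
    c_diag:     main diagonal of the inverse matrix, same indexing
    candidates: IV numbers not yet ordered in the current group
    values maps each candidate to its additional regression sum of
    squares; iv is the candidate with the minimum value.
    """
    values = {v: b[v] ** 2 / c_diag[v] for v in candidates}
    iv = min(values, key=values.get)
    return iv, values
```

The IV returned is the one ranked least important at this step; the
matrix A without its row and column is then inverted for the next
rerun.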

C. If the option to discontinue the identity matrix checks in BIVOR is
used (i.e., IBID = 1 on Card Type 2), then the identity matrix is
printed for the first BIVOR rerun in which the absolute values of all
deviations are < 1(1), together with the statement "DEVIATIONS OF ALL
ELEMENTS OF THE IDENTITY MATRIX SMALLER THAN 1(1) = ____. RUN
ACCEPTED. NO IDENTITY MATRIX CHECKS WILL BE MADE ON SUBSEQUENT BIVOR
RUNS." Accordingly, for ensuing reruns in a BIVOR sequence the
identity matrix is not computed and no checking is done. The purpose
of this option in BIVOR is to save computer time. Since each
subsequent BIVOR IVS contains only a subset of the independent
variables contained in the model of the rerun in which the checking
ceased, the assumption is made that, in the great majority of cases,
in all subsequent BIVOR runs all absolute deviations of the elements
of the identity matrix would be < 1(1).


VI.3 Printout

In this section the general formulation of the printout is given,
supplemented by comments when considered necessary for clarification.
(The comments are contained in Section VI.3.b.)


VI.3.a Formulation of Printout

This section contains the algebraic formulation of the printout of
DA-MRCA. The printout for one regression problem is divided into four
parts:

(1) Basic Information

(2)  Main  Body 

(3)  Analysis  of  Variance  Tables 

(4)  Final  Comprehensive  Analysis  Table. 

The  "Basic  Information"  part  is  printed  only  once  per  regression  problem 
and  contains 


(A) a printout of the problem parameters input on Card
Types 1-6,

(B)  the  data  matrix,  and 

(C)  the  summation  matrix. 

The second part, the "Main Body" printout, contains

(A) all information pertaining to the matrix inversion,

(B) various statistics,

(C) predicted values, prediction errors, normality test,
and averages, and

(D)  predicted  values  and  prediction  standard  deviations 

at  selected  input  and/or  synthetic  design  points  (optional). 

The  main  body  is  printed  for  the  main  run  and  for  each  rerun,  except  for 
specific  options  which  are  not  called  or  cannot  be  called  for  a  rerun. 

The third part contains the "Analysis of Variance Tables" for the main
run and for all reruns. The "Final Comprehensive Analysis Table" is
printed as the fourth and last part and contains information for hand
selected reruns and for IVOR and/or BIVOR, should any of these options
be exercised. All wording which is shown in capital letters is
actually printed by the program; all comments or general formulations
printed in lower case letters and put in parentheses are either not
printed at all by the program or not printed in this form.

The  comments  on  the  printout  formulation  are  given  in  the 
next  section  (VI. 3. b). 
VI.3.b Comments on Printout


The comments in this section refer to the algebraic formulation of the
printout as given in the previous section. The page numbers referenced
are the page numbers of that printout. In some instances the possible
use of the printed information is discussed inasmuch as this has not
been done before.

VI.3.b.(1) Basic Information

A.  Problem  Parameters  (page  87).  The  page  is 
headed  by  the  problem  identification  as  given  on  Card  Type  1.  This 
identification  is  repeated,  at  the  beginning  of  certain  features, 
throughout  the  program  output  for  ease  in  identifying  the  printout 
of  a  given  regression  problem  when  several  problems  have  been  run 
consecutively.  Page  87  contains  information  given  on  input  Card  Types 
2,  3,  4,  5,  and  6,  and  identifies  the  problem  parameters  chosen  for  the 
regression problem. The columns occupied by the program variables in
this printout do not all agree with those specified in the input
specification, Section V.2. For clarity of reading, the entries are
spaced across this page. The spaces filled by X's indicate digits are
to be
printed.  In  the  Card  Type  3  line,  the  individual  product  term 
descriptions  are  separated  by  slants.  Zeros  are  printed  in  the  spaces 
which  are  not  needed  to  represent  the  product  terms. 

B. Data Matrix (page 88). The data matrix printout
is  optional  (see  column  16,  Card  Type  2)  and  can  be  either  in  the 
format  9F13.6  or  7E17.8,  whichever  is  specified  on  Card  Type  2.  The 
data matrix is printed, if at all, for the main run only.

Each row of the data matrix is identified by its "data point number"
(i = 1,2,3,...,n) and consists of the N+1 coordinates of the N
independent variables and the dependent variable.

The coordinates of the OCIV's are listed in the same order as punched
on Card Type 8. If generated independent variables (GCIV's) are used,
they follow the OCIV's, and their coordinates are listed in the same
order as generated according to Card Type 3.

The data matrix is printed only once per regression problem (i.e., for
the main run) but can easily be obtained for any rerun by deleting the
column, or columns, that correspond to the independent variable(s)
which are deleted in the rerun.

C. Summation Matrix (page 88). The summation matrix is printed only
once per regression problem; its dimensions are N+2 by N+2. The
(N+1)x(N+1) matrix consisting of the first N+1 rows and columns of the
summation matrix is the matrix of the coefficients of the normal
equations for the main run, or the matrix A. Both the matrix A and the
summation matrix are symmetrical.

The summation matrix (and the matrix A) of any rerun can easily be
obtained by deleting the row(s) and column(s), which correspond to the
independent variable(s) to be deleted, from the summation matrix of
the main run.


VI.3.b.(2) Main Body

The formulation of the printout of the main body is done in terms of K
independent variables contained in the model. Accordingly, with K = N
or K = N' < N this formulation is valid for the main run or any rerun,
respectively. Wherever applicable, the K independent variables
contained in the model are consecutively renumbered from 1 to K. If,
for example, the first two independent variables of the main run are
not included in a rerun, then the third IV of the main run becomes IV
Number 1 of the rerun.

For reruns the main body is headed "INDEPENDENT VARIABLE SELECTION
(____) 0....." In the parentheses "HAND," or "IVOR," or "BIVOR,"
whichever applies, is printed. For the main run there is no
identification printed at this place. The IVS is specifically
identified by a series of N+1 0's and 1's, of which the first is
always a 0. These N+1 digits represent the constant (the first 0) and
the N independent variables, respectively, corresponding to their
order of input. If a specific independent variable is contained in the
IVS, a 0 is printed in the place corresponding to this IV; if it is
not contained in the IVS, a 1 is printed. Thus, when IV Number v
(v = 1,...,N) is contained in the IVS, digit number v+1 from the left
in this identification will be a 0. Because the constant (IV Number 0)
is always contained in an IVS, the first digit is always printed as a
0. The IV's not contained in an IVS (which are, accordingly,
represented by 1's), are often referred to as "deleted" IV's, that is,
as IV's "deleted from the model." The IVS identification is repeated
at various other places of the printout, when appropriate.
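
The identification digits and the renumbering described above can be
illustrated by a small Python sketch. The function names are
hypothetical; only the coding rule (0 for an IV contained in the IVS,
1 for a deleted IV, leading 0 for the constant) is taken from the
text.

```python
def ivs_identification(n_total, included):
    """Return the N+1 identification digits of an IVS as a string.

    n_total:  N, the number of independent variables of the main run
    included: set of main-run IV numbers contained in the IVS
    """
    digits = ["0"]                       # the constant is always contained
    for v in range(1, n_total + 1):
        digits.append("0" if v in included else "1")
    return "".join(digits)

def renumbering(included):
    """Map rerun IV numbers 1..K to the main-run IV numbers."""
    return {k: v for k, v in enumerate(sorted(included), start=1)}
```

With N = 5 and IV's 1 and 2 deleted, `ivs_identification(5, {3, 4, 5})`
gives the digits 011000, and `renumbering({3, 4, 5})` shows that IV
Number 3 of the main run becomes IV Number 1 of the rerun.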

A. Matrix Inversion (pages 89 and 90). The MATRIX INVERSION EVALUATION
TIME includes the time required to invert the matrix, compute the
determinant and solve the set of the normal
equations.  The  main  run  is  numbered  0,  the  first  rerun  1,  the  second 
rerun  2,  etc.  The  printouts  of  the  matrix  inversion  evaluation  time 
and  of  other  running  times  were  originally  included  for  a  time  study 
which  resulted  in  the  time  formulae  given  in  Section  VI. 4.  The  running 
time  printouts  have  been  left  in  the  program  as  a  convenience  for  the 
user. 
The DETERMINANT of the matrix A may be printed in the F format or the
E format depending upon the magnitude of the value of the determinant.
If the determinant is negative or equal to zero the statement "MATRIX
FAILED TO INVERT" is printed. (See Section VI.2.a.(2).)

The elements of $A^{-1}$, i.e., of the INVERSE OF MATRIX A, are
denoted as $c_{vv'}$ ($v = 0,1,\ldots,K$; $v' = 0,1,\ldots,K$). The
inverse matrix should be symmetrical, i.e., $c_{vv'} = c_{v'v}$, but
is sometimes not because of computational inaccuracies. Its dimensions
are (K+1) by (K+1).


For further statements concerning the failure of the matrix inversion
see paragraphs D, E, and F of Section VI.2.a.(2).

The SOLUTION TO SIMULTANEOUS EQUATIONS is the vector of the K+1
regression coefficients $b_v$, $v = 0,1,\ldots,K$, with

$$b_v = \sum_{v'=0}^{K} c_{vv'} \, \Sigma x_{v'} y .$$

The elements of the calculated IDENTITY MATRIX ($I_c$) are obtained by
multiplying the inverse matrix $A^{-1}$ by the matrix A, i.e.,
$I_c = A^{-1}A$. The dimensions of $I_c$ are K+1 by K+1.

For possible printouts regarding the magnitude of the elements of the
calculated identity matrix see paragraphs H and I of Section
VI.2.a.(2) and Section VI.2.d. When the statement "DEVIATIONS OF ALL
ELEMENTS OF THE IDENTITY MATRIX SMALLER THAN 1(1) = ____. RUN
ACCEPTED" is made, the identity matrix is not printed.

B. Various Statistics (page 90). The STANDARD DEVIATION OF
(regression) COEFFICIENTS,

$$\hat{\sigma}[b_v] = s \sqrt{c_{vv}} ,$$

are always consecutively numbered as described at the beginning of
this section (VI.3.b.(2)). No. 1 is always the standard deviation of
$b_0$.

In the main run, the standard deviation identified by the number 3,
for example, is the standard deviation of the second regression
coefficient, $b_2$. In a rerun, the standard deviation numbered 2, for
example, may be the standard deviation of the regression coefficient
of IV No. 3 if IV's No. 1 and No. 2 (in the original model) have been
deleted for this IVS.
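
How the regression coefficients and their standard deviations follow
from the inverse matrix can be shown with a few lines of Python. This
is a sketch under stated assumptions: the inverse matrix is passed as
a list of rows, `rhs` holds the right-hand sides of the normal
equations (the sums of products of each IV with y), and the function
names are hypothetical.

```python
import math

def solve_coefficients(c, rhs):
    """Coefficients as the product of the inverse matrix and rhs."""
    return [sum(cv * r for cv, r in zip(row, rhs)) for row in c]

def coefficient_std_devs(c, s):
    """Standard deviations: s times the square root of each main
    diagonal element of the inverse matrix."""
    return [s * math.sqrt(c[v][v]) for v in range(len(c))]
```

The first entry of each returned list belongs to the constant, which
is why the printed standard deviation No. 1 is that of the constant
term.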

The 5 other statistics are denoted elsewhere in the printout
formulation and at various places of the report, as follows:

RESIDUAL OR ERROR SUM OF SQUARES = SSE

TOTAL SUM OF SQUARES ADJUSTED FOR THE MEAN = ATSS

REGRESSION SUM OF SQUARES (due to K IV's) ADJUSTED FOR THE
MEAN = $\mathrm{ASSR}_K$

CORRELATION COEFFICIENT = R
SQUARE ROOT OF RESIDUAL VARIANCE = s

Notice that, besides SSE, $\mathrm{ASSR}_K$, and R, also
the  standard  deviation,  s,  is  redefined  in  each  run  (with  K  independent 
variables  contained  in  the  model)  and  is  the  basis,  in  that  run,  for 
the  computation  of  the  standard  deviations  of  the  regression  coefficients, 
the  normality  test  and  the  prediction  standard  deviations  at  selected 
input  and/or  synthetic  design  points. 

C. Predicted Values, Prediction Errors, Normality Test, and Averages
(pages 91 and 92). For each of the n input design points the PREDICTED
VALUE ($\hat{y}_i$) is printed, and similarly the PREDICTION ERROR
($e_i$) as obtained by subtracting the predicted value from the actual
observation of y. The number of the input design point is also printed
and is referred to, in the heading of this printout, as ITEM NUMBER.


The CHECK ERROR SUM OF SQUARES,

$$\sum_{i=1}^{n} \Big[ y_i - \sum_{v=0}^{K} b_v x_{vi} \Big]^2 ,$$

should equal the Residual or Error Sum of Squares (SSE). Any
discrepancy between the two is an indication of computer inaccuracy.

(See paragraph C of Section VI.2.a.(3).)
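
The check error sum of squares amounts to recomputing the residual sum
of squares directly from the data, as in the following Python sketch.
The function name and the data layout (each row starting with the
value 1 for the constant) are assumptions made for the illustration.

```python
def check_error_sum_of_squares(x_rows, y, b):
    """Sum of squared prediction errors over all data points.

    x_rows[i] holds the coordinates of data point i, beginning with
    1.0 for the constant; b holds the regression coefficients in the
    same order; y holds the observed values.
    """
    total = 0.0
    for row, yi in zip(x_rows, y):
        predicted = sum(bv * xv for bv, xv in zip(b, row))
        total += (yi - predicted) ** 2
    return total
```

Comparing this value with SSE, which is obtained from the summation
matrix, gives a direct indication of the rounding error accumulated in
the inversion and solution.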

The printout format for the predicted values and for the prediction
errors is affected by the value of NDPO (column 16, Card Type 2). If
NDPO≠1, these values are printed in the format 2F15.6; if NDPO=1, they
are printed in the format 2E15.6.

The  features  of  the  PREDICTION  ERROR  FREQUENCY 
DISTRIBUTION  are  explained  in  detail  in  paragraphs  D  and  E  of  Section 
VI.2.a.(3). The bar chart gives a graphical representation of the
distribution  of  the  prediction  errors.  Each  prediction  error  is 
represented  by  an  X.  Should  the  number  of  prediction  errors  in  any 
interval  be  greater  than  60  (thereby  exceeding  the  space  provided  for 
the  X's),  an  asterisk  is  printed  at  the  end  of  the  60  X's.  For  the 
purpose  of  easier  reading,  the  bar  chart  is  printed  to  the  right  of  a 
column  of  "I"s,  one  "I"  for  each  of  the  30  intervals. 
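
The layout rule for one line of the bar chart can be sketched in
Python as follows. The exact column spacing of the actual printout is
not reproduced; the function names and the absence of any spacing
between the "I" and the X's are assumptions of the sketch.

```python
MAX_X = 60  # space provided for the X's of one interval

def bar_line(count):
    """One bar-chart line: an I, then one X per prediction error.

    An asterisk follows the 60 X's when the count exceeds 60.
    """
    if count > MAX_X:
        return "I" + "X" * MAX_X + "*"
    return "I" + "X" * count

def bar_chart(counts):
    """Bar chart for all intervals, one line per interval."""
    return "\n".join(bar_line(c) for c in counts)
```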
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The  entries  for  the  CHI-square  contribution,  the 
OBServed  FRequencies  and  the  EXPecteD  FRequencies  are  discussed,  together 
with  the  establishing  of  the  m  new  intervals,  in  paragraphs  F  and  G  of 
Section  VI. 2. a. (3).  In  paragraphs  E  and  H  of  that  section  the  checks 
are  discussed  which  lead  to t he  possible  printout  "CHISQUARE  COULD  NOT 
BE  COMPUTED." 
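
The bar chart rule described above (one X per prediction error, at most
60 X's, an asterisk marking an overflow) can be sketched in Python as
follows. This is only an illustration: the function names and the exact
spacing of the line are assumptions, and the actual printout layout of
DA-MRCA may differ.

```python
def bar_chart_line(count, width=60):
    """One row of the chart: an "I" marker followed by one X per error.

    If `count` exceeds `width`, only `width` X's are written and an
    asterisk is appended to signal that the bar was cut off.
    """
    if count > width:
        return "I" + "X" * width + "*"
    return "I" + "X" * count


def bar_chart(counts, width=60):
    """Rows for all intervals, one row per interval count."""
    return [bar_chart_line(c, width) for c in counts]


# Hypothetical interval counts, for illustration only.
for row in bar_chart([0, 3, 61]):
    print(row)
```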

The AVERAGES OF INDEPENDENT VARIABLES AND
DEPENDENT VARIABLE are printed only once per regression problem and
are numbered, accordingly, from 1 to N+1, such that the average of the
dependent variable is numbered N+1.

D.  Predictions  at  Selected  Input  and/or  Synthetic 
Design  Points  (page  93) .  Predicted  values  and  standard  deviations  at 
selected  input  design  points  and/or  synthetic  design  points  are 
optionally  computed  and  printed  for  the  main  run  and  hand  selected 
reruns  only  (see  Card  Type  2,  columns  8-13).  They  cannot  be  obtained 
for  IVOR  or  BIVOR  reruns. 

The  coordinates  of  the  OCIV's  for  the  SELECTED 
INPUT  DESIGN  POINTS  and/or  the  SYNTHETIC  DESIGN  POINTS  are  printed  for 
ease  in  identifying  which  points  were  selected  and/or  specified, 
respectively. In the general formulation, the selected input design
points are renumbered 1, ..., q, ..., Q; whereas the synthetic
design points are consecutively numbered Q+1, ..., q', ..., Q'.
The coordinates are renumbered 1', 2', ... in order to indicate that
these are the coordinates of the OCIV's contained in the IVS of the
run.


For each of the design points, selected or
synthetic, the PREDICTED VALUE, $\hat{y}(p)$, and the PREDICTION STANDARD
DEVIATION FOR THE PREDICTION LINE, $s(p)$, or the PREDICTION STANDARD
DEVIATION FOR INDIVIDUAL OBSERVATIONS, $s'(p)$, are printed. The index
"(p)" refers to the number ("(q)" or "(q')") of the point in
the set of the OCIV coordinates printed previously and is given under
the heading ITEM NUMBER.

Either $s(p)$ or $s'(p)$, but not both, can be
obtained in a given problem. (See Card Type 2, column 14.) Should,
however, both standard deviations be desired, the one that is not
printed can be obtained as follows:

If $s(p)$ is printed: $s'(p) = \sqrt{(s(p))^2 + s^2}$

If $s'(p)$ is printed: $s(p) = \sqrt{(s'(p))^2 - s^2}$

(Note: The standard deviations $s(p)$ and $s'(p)$, as given in the printout
formulation, are actually computed by the program in the "adjusted"
form, i.e., for example,

$$s(p) = s \sqrt{\frac{1}{n} + \sum_{v=1}^{K} \sum_{v'=1}^{K} c_{vv'}\,
\bigl(x_v(p) - \bar{x}_v\bigr)\bigl(x_{v'}(p) - \bar{x}_{v'}\bigr)} ,$$

where

$$\bar{x}_v = \frac{1}{n} \sum_{i=1}^{n} x_{vi} \; .)$$


The standard deviations will be useful if one
wants to construct 100(1-α)% confidence limits, $L_{1-\alpha,(p)}$, for
the prediction line, i.e.,

$$L_{1-\alpha,(p)} = \hat{y}(p) \pm s(p)\, t_{1-\frac{\alpha}{2},\; n-K-1} ,$$

or 100(1-α)% "tolerance" limits for individual future observations,
i.e.,

$$L'_{1-\alpha,(p)} = \hat{y}(p) \pm s'(p)\, t_{1-\frac{\alpha}{2},\; n-K-1} .$$
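
The conversion between the two standard deviations and the construction
of the limits can be sketched in Python as follows. The function names
and the numerical values are hypothetical and serve only as an
illustration; the t value has to be taken from a table by the user, as
it is not computed here.

```python
import math


def individual_from_line(s_line, s):
    """Std. deviation for individual observations from that of the line."""
    return math.sqrt(s_line ** 2 + s ** 2)


def line_from_individual(s_individual, s):
    """Std. deviation for the prediction line from that of individuals."""
    return math.sqrt(s_individual ** 2 - s ** 2)


def limits(y_hat, s_p, t_value):
    """Lower and upper limit: y_hat -/+ s_p * t_value."""
    half_width = s_p * t_value
    return y_hat - half_width, y_hat + half_width


# Hypothetical values: residual std. deviation s and printed s(p).
s = 2.0
s_line = 1.5
s_individual = individual_from_line(s_line, s)

# 2.110 is the tabled two-sided 5% point of t for 17 degrees of freedom.
low, high = limits(10.0, s_line, 2.110)
print(s_individual, low, high)
```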

The synthetic design point feature can also be
useful just for obtaining the predicted values of the regression
equation for design points other than those originally input. In
other words, the feature can be advantageously applied for interpolation.

At  the  end  of  the  "Main  Body,"  the  computer  time 
required  to  perform  all  of  the  calculations  for  this  run  is  printed: 

"RUN  (number)  TOOK  .  SECONDS."  The  main  run  is  identified  as  run 

0,  the  first  rerun  as  run  1,  etc. 

VI.3.b.(3)  Analysis  of  Variance  Tables 

For  each  run  (main  run  or  rerun)  an  analysis  of 
variance  table  (page  94)  is  printed.  The  essential  statistics  of  the 
run  are  given  in  analysis  of  variance  form,  including,  at  the  bottom, 
the  estimated  regression  equation  for  that  run.  The  terms  contained 
in  these  tables  are  taken  from  the  results  of  the  computations 
previously performed. The definitions of the terms are given in the
"Various Statistics" part of the Main Body; see paragraph B of Section
VI.3.b.(2). The two mean squares ("MS") and the F value are computed
specifically for this table.

It  must  be  emphasized  that  each  analysis  of  variance 
table  has  its  own  error  term  based  on  n-K-l  degrees  of  freedom.  The 
two  blank  rows,  each  headed  by  the  word  "REGRESSION,"  are  available 
for convenience in case the user wishes to calculate (by hand) a main
theorem F value (III-1) for testing a specific hypothesis. For an
example of this, see the corresponding printout of the Example Problem,
Section VI.5.


The  subscripts  of  the  independent  variables  in  the 
regression  equation  are  the  original  numbers  of  the  IV's  as  input  for 
the  main  run.  (This  is  different  from  the  Main  Body  in  which  the  K 
IV's  in  the  IVS  are  renumbered  from  1  to  K.)  For  example,  if  IV 
Number  v  is  not  included  in  the  IVS,  the  term  with  X(v)  is  not  present 
in  this  printout  of  the  regression  equation. 


VI.3.b.(4)  Final  Comprehensive  Analysis  Table 

The Final Comprehensive Analysis Table (page 95) gives
the F values (III-1) of the main theorem FOR REGRESSION ON DELETED
VARIABLES for each rerun, together with the COEFFICIENT OF DETERMINATION,
the NUMBER ("NO." = DF = DEGREES OF FREEDOM) OF DELETED VARIABLES and
the identification of the INDEPENDENT VARIABLE SELECTION. Although
implied by the application of the main theorem, it is emphasized that
all F values are based on the error term of the main run with n-N-1
degrees of freedom. The table is also a very convenient means to
show the order in which the independent variables are ranked by IVOR
and/or BIVOR if these options are exercised. There is a certain
danger of misinterpretation of the BIVOR final comprehensive analysis
when a BIVOR independent variable selection is rejected only on the
grounds of failing the identity matrix checks. In this case the
rightmost IV is deleted from the model, which might appear as a genuine
BIVOR ordering of this independent variable if one judges from the
final comprehensive analysis table only. For more details see
paragraph (A.b) of Section VI.2.d.

Should the Final Comprehensive Analysis not be
printed (but reruns are present), the statement "NO FINAL COMPREHENSIVE
PRINTOUT SINCE MATRIX FOR MAIN RUN COULD NOT BE INVERTED" is given.


VI.4 Running Time Formulae

The  formulae  of  this  section  give  the  approximate  times  (in 
seconds)  which  are  required  by  the  IBM  7030  STRETCH  computer  to  execute 
the  various  parts  and  options  of  the  DA-MRCA  program.  In  these  formulae 
the  time,  T  (in  seconds),  is  expressed  in  terms  of  the  input  parameters 
N,  N',  IQ,  and  n,  where 

N = number of IV's contained in the model of the main run,

N' = number of IV's contained in the model of any (hand selected)
rerun,

IQ = number of IV's to be ordered by IVOR, and

n = number of data points input.

The  formulae  are  based  upon  the  results  of  a  time  study  in  which 
a  series  of  regression  problems  was  actually  computed  by  the  program. 

In  this  study,  each  regression  problem  represented  a  unique  combination 
of the values of, at the most, three of the input parameters N, N', IQ,
and  n;  and  from  each  problem  the  time(s)  required  for  the  computations 
were  recorded.  The  ranges  of  the  four  parameters  were  taken,  in  the 
time  study,  as  they  are  likely  to  occur  in  actual  regression  problems. 

N  and  N'  were  varied  over  the  full  range,  that  is,  up  to  the  capacity 
of  the  program  which  is  N=50  independent  variables.  IQ  took  the  values 
2,  4,  8,  and  16;  and  the  numbers  of  data  points,  n,  were  60,  120,  240, 
and  480. 


Then  DA-MRCA  was  used  to  fit  polynomials  in  N,  N',  IQ,  n  (as 
applicable)  to  the  responses,  T,  i.e.,  to  the  actual  running  times 
observed.  (In  terms  of  the  present  report,  T  was  the  "dependent" 
variable  and  N,  N',  IQ,  and  n  were  the  "OCIV's.")  As  a  matter  of 
fact,  both  IVOR  and  BIVOR  were  employed  to  evaluate  the  most  efficient 
polynomials  for  the  prediction  of  the  running  times. 

The  coefficients  in  these  polynomials  (i.e.,  the  "regression" 
coefficients)  were  rounded  such  that  the  formulae  give,  in  general,  a 
safe  upper  limit  for  the  running  times. 

Little is known about extrapolation with respect to n, the number
of data points. However, since 4 points have been used within the range
of the study (0 < n ≤ 480), thus allowing a 3rd order polynomial in n
to be fitted, some extrapolation should be permissible.

The  formulae  are  as  follows: 


a. Time (in seconds) for the main run, excluding the option
for predicted values and prediction standard deviations at
selected input and/or synthetic design points:

$$T_1 = 2 + \frac{nN}{1000}\left[8 - \frac{5n}{1000}\right] \tag{VI-18}$$


b. Time (in seconds) for one hand selected rerun with N' IV's
contained in the model, excluding the options for (1) predicted
values, prediction errors, and the normality test, and (2) predicted
values and prediction standard deviations at selected
input and/or synthetic design points:

$$T_2 = \frac{7(N')^2}{1000} \tag{VI-19}$$

($T_2$ = 17 seconds for N' = 49)

c. Time (in seconds) for the option for predicted values,
prediction errors, and the normality test for one hand selected
rerun:


T3  =  (VI-20) 

1000 

d. Time (in seconds) for the Final Comprehensive Analysis
computations for M hand selected reruns:

$$T_4 = \frac{M}{2} \tag{VI-21}$$


e. Time (in seconds) for one IVOR sequence in which only the
first IQ most important IV's out of N are ordered, including
the computations for the IVOR Final Comprehensive Analysis and
excluding the main run and the option for predicted values,
prediction errors, and the normality test:

$$T_5 = 2 + \frac{8(IQ)^2 N}{1000} \tag{VI-22}$$

($T_5$ = 1002 seconds for IQ=N=50)


f. Time (in seconds) for one BIVOR sequence in which all N
IV's are ordered, including the computations for the BIVOR
Final Comprehensive Analysis and excluding the main run and the
option for predicted values, prediction errors, and the normality
test:

$$T_6 = 5 + \frac{2N^3}{1000} \tag{VI-23}$$

($T_6$ = 255 seconds for N=50)


g. Time (in seconds) for the option for predicted values,
prediction errors, and the normality test in one IVOR sequence
in which only the first IQ most important IV's out of N are
ordered:

$$T_7 = (IQ+1)\left[1 + \frac{0.35\,n\,(N+1)}{1000}\right] \tag{VI-24}$$


h. Time (in seconds) for the option for predicted values,
prediction errors, and the normality test in one BIVOR sequence:

$$T_8 = (N+1)\left[1 + \frac{0.35\,n\,(N+1)}{1000}\right] \tag{VI-25}$$


Some discussion of these formulae seems to be appropriate.

$T_1$, $T_5$, and $T_6$ each contain a constant term which, although of
lesser importance, was not considered small enough to be neglected.

In $T_1$ the term $\frac{5n}{1000}$ should probably be subtracted from 8
only if n is smaller than 500 and be disregarded otherwise. ($T_1$ as
given in (VI-18) has its maximum at n=800.) Since the polynomial was
fitted only for the range 0 < n ≤ 480, this rule seems to give some safe
margin for extrapolation beyond n=480, and the formula would read,
for these larger values of n, as:

$$T_1 = 2 + \frac{8nN}{1000} .$$

For obvious reasons, only $T_1$, $T_3$, $T_7$, and $T_8$ depend upon n,
the number of data points input, while the other 4 time formulae do
not contain n. For $T_2$, $T_5$, and $T_6$, the maximum numerical values
are given, in order to indicate the speed of the program with respect to
reruns.


The comparison of $T_5$ and $T_6$ shows that a full IVOR sequence (with
IQ=N) takes approximately 4 times the time of a full BIVOR sequence.
Naturally, $T_5$ is strictly valid only for IQ ≤ 16; however, it can be
assumed that it is approximately valid also for the whole range, i.e.,
IQ ≤ 50.

$T_5$ and $T_6$ were obtained without the grouping of IV's in IVOR and
BIVOR. This means that, if grouping is applied in these options, the
running times will be less than given by $T_5$ and/or $T_6$.

Obviously, $T_7$ and $T_8$ are identical for IQ=N.

No  formulae  have  been  evaluated  for  the  option  to  compute 
predicted  values  and  prediction  standard  deviations  at  selected  input 
and/or  synthetic  design  points. 

The  actual  running  times  of  the  various  parts  of  the  example 
problem  in  Section  VI. 5  may  serve  as  examples  of  the  application  of 
the  formulae.  In  the  example  problem,  the  parameters  take  the 
following  values: 

N  =  9 

N' = 3 (in M=1 hand selected rerun)

IQ  =  4 
n  =  20 

This gives the following times:

a.
$$T_1 = 2 + \frac{(20)(9)}{1000}\left[8 - \frac{5(20)}{1000}\right] = 3.42$$

(The actual time for "RUN 0", including predicted values and
prediction standard deviations, was 4.03 seconds.)

b.
$$T_2 = \frac{7(3)^2}{1000} = 0.06$$


c-  = 

3  1000 

($T_2 + T_3 = 0.10$, but "RUN 1" included predicted values and
prediction standard deviations and actually took 2.10 seconds.)

d.
$$T_4 = \frac{1}{2} = 0.50$$

e.
$$T_5 = 2 + \frac{8(4)^2(9)}{1000} = 3.15$$

f.
$$T_6 = 5 + \frac{2(9)^3}{1000} = 6.46$$

g.
$$T_7 = (4+1)\left[1 + \frac{(0.35)(20)(9+1)}{1000}\right] = 5.35$$

h.
$$T_8 = (9+1)\left[1 + \frac{(0.35)(20)(9+1)}{1000}\right] = 10.70$$

This gives a total of

$$\sum_{j=1}^{8} T_j = 29.68 \text{ seconds.}$$

The actual "TOTAL PROBLEM RUNNING TIME" was 29 seconds. The latter
time included the predicted values and prediction standard deviations
at 2 selected input design points and 3 synthetic design points in the
main run and in the only hand selected rerun, which seems to compensate
for the time saving in IVOR and BIVOR due to the grouping feature as
applied here but not considered in the time formulae.
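
The arithmetic of this example can be retraced with a short Python
sketch. The formulae coded below are a reading of (VI-18), (VI-19), and
(VI-21) through (VI-25) that reproduces the numerical values quoted in
this section; they should be regarded as assumptions where the printed
formulae are hard to read. No formula is assumed for $T_3$: it is entered
as the value 0.04 implied by the quoted sum $T_2 + T_3 = 0.10$. All
function names are hypothetical.

```python
def t1(n, N):
    """Main run, reading of (VI-18)."""
    return 2 + n * N / 1000 * (8 - 5 * n / 1000)


def t2(N_prime):
    """One hand selected rerun, reading of (VI-19)."""
    return 7 * N_prime ** 2 / 1000


def t4(M):
    """Final Comprehensive Analysis for M reruns, reading of (VI-21)."""
    return M / 2


def t5(IQ, N):
    """One IVOR sequence, reading of (VI-22)."""
    return 2 + 8 * IQ ** 2 * N / 1000


def t6(N):
    """One BIVOR sequence, reading of (VI-23)."""
    return 5 + 2 * N ** 3 / 1000


def t7(IQ, n, N):
    """Prediction option in one IVOR sequence, (VI-24)."""
    return (IQ + 1) * (1 + 0.35 * n * (N + 1) / 1000)


def t8(n, N):
    """Prediction option in one BIVOR sequence, (VI-25)."""
    return (N + 1) * (1 + 0.35 * n * (N + 1) / 1000)


# Parameters of the example problem.
N, N_prime, IQ, n, M = 9, 3, 4, 20, 1
T3_QUOTED = 0.04  # implied by T2 + T3 = 0.10; no formula assumed

times = [t1(n, N), t2(N_prime), T3_QUOTED, t4(M),
         t5(IQ, N), t6(N), t7(IQ, n, N), t8(n, N)]
total = sum(times)
print([round(t, 2) for t in times], round(total, 2))
```

The unrounded total differs from the quoted 29.68 seconds only by the
rounding of the individual terms.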


VI.5 Example Problem

The example regression problem contained in this section is
given in order to illustrate the various capabilities of the DA-MRCA
program and to exhibit a sample of the program output.

The data of the example problem, as listed in the table below,
was taken from Duncan [1959], p. 697. This was done in preference to
fabrication of artificial variables and data, and the example was
selected as a representation of a typical regression problem.
(Naturally, no attempt is made to find a practical solution to any
aspect of the general ballistic problem.)

There are n=20 data points in the problem. Each one consists
of (a) the coordinate of the dependent variable, y = "Ballistic Limit",
which is a measure in ft./sec. of the projectile velocity required to
penetrate armor plate; (b) the coordinate of the first OCIV, $x_1$ =
thickness of plate in inches; and (c) the coordinate of the second
OCIV, $x_2$ = Brinell hardness number of the plate material.


        y                  x1                x2
 Ballistic Limit      Thickness in        Brinell
  in Feet/Sec.           Inches         Hardness No.

       927                .253              317
       978                .258              321
     1,028                .259              341
       906                .247              350
     1,159                .256              352
     1,055                .246              363
     1,335                .257              365
     1,392                .262              375
     1,362                .255              373
     1,374                .258              391
     1,393                .253              407
     1,401                .252              426
     1,436                .246              432
     1,327                .250              469
       950                .242              275
       998                .243              302
     1,144                .239              331
     1,080                .242              355
     1,276                .244              385
     1,062                .234              426

The input preparation for the example problem, based on this
data, is exemplified in Section V.3.

The GCIV's generated are $x_1 x_2$, $x_1^2$, $x_2^2$, $x_1^2 x_2$,
$x_1 x_2^2$, $x_1^3$, and $x_2^3$.
Both ranking options, IVOR and BIVOR, are exercised. There are 2
groups of IV's specified in IVOR: the two OCIV's, $x_1$ and $x_2$, are in
the first group and the 7 GCIV's are in the second group. Only IQ=4 IV's
are to be ranked. Under the restriction due to grouping, these 4 IV's
will include the two OCIV's of the first group (to be ranked among
themselves) and the two most important GCIV's of the second group. In
BIVOR, there are 3 groups: the two OCIV's are in the first group,
the three GCIV's of second order are in the second group, and the four
GCIV's of third order are in the third group. For the other
specifications see Section V.3.

Pertinent comments in handwriting are added to the computer
printout exhibited. Due to space limitations the printout is not
complete, some printout having been deleted. Whenever this applies,
an appropriate comment is made.

The two IVOR analysis of variance tables exhibited are used to
show the type of hypothesis testing which can be conveniently achieved
with these tables. The example null hypothesis is that fitting $x_1$
(plate thickness) in addition to $x_2$ (Brinell hardness) does not
significantly reduce the error sum of squares. This hypothesis is
rejected at the 0.05 level of significance, which implies that
including $x_1$ in the model in addition to $x_2$ does improve the fit
significantly.
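
A test of this kind can be sketched in Python as follows. The sketch is
an illustration only: it assumes the usual F ratio for the reduction in
the error sum of squares, with the error term of the model containing
both $x_1$ and $x_2$ (1 and n-3 = 17 degrees of freedom), and it does not
reproduce the DA-MRCA computations. The data are those of the table
above; the function names are hypothetical.

```python
y = [927, 978, 1028, 906, 1159, 1055, 1335, 1392, 1362, 1374,
     1393, 1401, 1436, 1327, 950, 998, 1144, 1080, 1276, 1062]
x1 = [.253, .258, .259, .247, .256, .246, .257, .262, .255, .258,
      .253, .252, .246, .250, .242, .243, .239, .242, .244, .234]
x2 = [317, 321, 341, 350, 352, 363, 365, 375, 373, 391,
      407, 426, 432, 469, 275, 302, 331, 355, 385, 426]


def centered(values):
    """Values minus their average."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def f_for_added_variable(y, kept, added):
    """F value for fitting `added` in addition to `kept`."""
    n = len(y)
    yc, k, a = centered(y), centered(kept), centered(added)
    sst = dot(yc, yc)
    ssr_kept = dot(k, yc) ** 2 / dot(k, k)
    # Part of `added` that is not explained by `kept`.
    coef = dot(a, k) / dot(k, k)
    resid = [ai - coef * ki for ai, ki in zip(a, k)]
    ssr_extra = dot(resid, yc) ** 2 / dot(resid, resid)
    sse_full = sst - ssr_kept - ssr_extra
    return ssr_extra / (sse_full / (n - 3))


F = f_for_added_variable(y, kept=x2, added=x1)
print(F)
```

The resulting value would be compared with the tabled F value for 1 and
17 degrees of freedom, which is 4.45 at the 0.05 level.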

On the page where the final comprehensive analysis table is
printed some interpretation is given of the rankings of the IV's
resulting from IVOR and BIVOR. The IVS column is repeated in
handwriting in order to clearly identify the IV's additionally included
(symbol "0") and deleted (symbol "1") in consecutive steps of IVOR
and BIVOR, respectively.

If  the  analyst  wants  to  determine  a  "significant  model"  from  each 
of  these  rankings,  he  may  choose  a  significance  level  for  the  F  value 
("for  regression  on  deleted  variables")  and  determine  the  model 
accordingly.  The  analyst  must  be  aware  that  such  a  model  may  depend 
upon  the  grouping  of  the  IV's.  For  example,  in  the  IVOR  ranking  of 
the  present  example,  any  significant  model  including  any  IV  of  the 
second  group  must  necessarily  also  include  the  two  OCIV's.  It  could 
be  imagined  that  without  grouping,  one  of  the  two  OCIV's  might  not 
have  been  considered  part  of  the  significant  model. 

With  P=0.05,  say,  as  the  chosen  significance  level,  the 
"significant  models"  from  the  two  rankings  are  determined  as  follows. 

The last and first significant F value in IVOR and BIVOR, respectively,
is F = 3.384 with 7 and 10 degrees of freedom. (The tabled F value
for 7 and 10 degrees of freedom at the 0.05 significance level is 3.14.)
This leads to a "significant model" from IVOR which includes $x_1$, $x_2$, and
XjX- , with an associated coefficient of determination ($R^2$) equal to 0.76.
The "significant model" from BIVOR includes $x_1$, $x_2$, and x$ Xj> , with $R^2$ =
0.75. Thus the two "significant models" differ only in their least
important IV's, which might be due to the different groupings used in
IVOR and BIVOR. (Because of the grouping in BIVOR, xxx; had to be
deleted in one of the first four steps.)

For a comparison of the actual times used by DA-MRCA to compute
(and print) the various parts of the problem, with the times predicted
by the formulae given in Section VI.4, see the end of that section.


VII. FAILURE ANALYSIS


This  chapter  is  concerned  with  failures  which  may  occur  in  the 
use  of  the  DA-MRCA  program.  In  this  context,  a  "failure"  is  defined 
in  a  very  broad  sense:  It  is  meant  to  include  all  cases  in  which  the 
user  receives  an  output  from  the  program  which  is  principally  different 
from  what  he  expected  to  receive  and  what  he  was  justified,  from  his 
own  good  judgment,  to  expect. 


VII.1 Classification of Failures


The  program  user  probably  will  encounter  cases  in  which  the 
desired  results  of  the  regression  analysis  cannot  be  obtained  in 
specific  runs.  The  program  will  indicate  this  failure  (a)  by  stating, 
in  some  form,  that  the  inverse  of  the  matrix  of  the  normal  equations 
could  not  be  obtained,  or  (b)  by  making  a  statement  that  the  calculated 
identity  matrix  failed  the  accuracy  check  on  the  main-diagonal  element 
deviations  from  1.  (For  details  about  the  statements,  see  Section  VI. 2.) 
Sometimes  an  inverse  is  obtained  by  the  program  although  the  user  knows 
that  the  matrix  is  singular.  This  type  of  failure,  however,  should 
always  become  obvious  by  the  accuracy  checks  on  the  identity  matrix. 

In  this  chapter  the  above  indicated  failures  and  their  causes, 
as  far  as  they  are  known  to  the  authors,  are  analyzed  and  some 
corrective  measures  are  discussed  which  the  user  might  apply  in  order 
to  obtain  the  desired  problem  solution.  It  can  generally  be  stated 
that  the  failures  are  caused  by  inherent  computer  inaccuracies.  The 
only  exception  is  when  no  inverse  is  obtained  because  there  are  unknown 
linear  dependencies  among  the  rows  or  columns  of  the  matrix  of  the 
normal  equations. 

The  chart  given  on  the  following  page  represents  a  classification 
of  possible  failures  and  their  causes.  The  chart  should  be  self- 
explanatory;  the  causes  as  indicated  in  the  appropriate  boxes  are  defined 
and  discussed,  along  with  some  corrective  measures,  in  Section  VII. 2. 

The  authors  do  not  claim  that  the  list  of  causes  is  complete;  however, 
all  causes  known  to  the  authors  are  given. 

In  the  main  area  of  failures,  where  the  matrix  is  expected  to 
invert  and  the  calculated  identity  matrix  is  expected  to  pass  the 
accuracy  checks  (first  two  rows  of  the  chart),  the  analyst  will  be 
unable  to  readily  identify  the  cause(s)  of  the  program  failure  since 
he  cannot  be  certain  that  theoretically  there  is  a  solution.  However, 
by following the suggested corrective measures to be discussed, he
may be able to obtain a solution and thereby to identify the cause(s)
of the original failure.

The user of the program might ask why he should encounter the
case  in  which  the  matrix  is  not  expected  to  invert  (last  row  of  the 
chart)  when  in  fact  theoretically  there  is  no  solution  but  the  program 
yields  an  inverse.  (Such  an  inverse,  however,  will  be  identified  as 
fictitious  by  the  inaccurate  identity  matrix.)  This  case  may  indeed 
occur,  for  example,  in  the  main  run,  when  the  analyst  specifies  a 
series  of  feasible  independent  variable  selections  (by  hand)  from  an 
original  set  of  N  independent  variables  where  N  is  larger  than  or 
equal  to  the  number,  nN ,  of  distinct  input  design  points. 

It  is  important  to  note  that  obtaining  an  inverse  in  such  a 
situation  constitutes,  from  the  analyst's  point  of  view,  a  failure 
with  respect  to  what  should  be  expected  from  the  program.  The  event 
of  obtaining  this  kind  of  fictitious  inverse,  therefore,  has  its 
proper  place  in  the  failure  chart. 


                                   Failure Chart*

+----------------------+----------------+----------------------+----------------------+
|                      |                | Matrix inverts but   | Matrix does not      |
|                      |                | identity matrix      | invert               |
|                      |                | fails accuracy check |                      |
+----------------------+----------------+----------------------+----------------------+
| Analyst expects the  | Theoretically  | Cause of failure:                           |
| matrix to invert and | there is a     | Limited computer accuracy                   |
| the identity matrix  | solution       |                                             |
| to pass accuracy     +----------------+----------------------+----------------------+
| check (since there   | Theoretically  | Cause of failure:    | Cause of failure:    |
| are no obvious       | there is no    | Non-obvious linear   | Non-obvious linear   |
| linear dependencies) | solution       | dependencies plus    | dependencies         |
|                      |                | truncation errors    |                      |
+----------------------+----------------+----------------------+----------------------+
| Analyst does not     | Theoretically  | Cause of failure:    |                      |
| expect the matrix to | there is no    | truncation errors    |                      |
| invert (since there  | solution       |                      |                      |
| are obvious linear   |                |                      |                      |
| dependencies)        |                |                      |                      |
+----------------------+----------------+----------------------+----------------------+

* For the definitions of the terms used in the Failure Chart see the
remaining sections of this chapter.


VII.2 Discussion of Failure Causes, Some Corrective Measures,
and Examples

In this section the three failure causes, i.e., limited computer
accuracy, linear dependencies, and truncation errors, will be discussed
and some corrective measures and examples be given.


VII.2.a Limited Computer Accuracy

As is well known, no computer, large as it may be, is an
"ideal computer," that is, a computer with absolute accuracy. The
inaccuracy of the IBM 7030, for example, with its error in the
fourteenth decimal digit (when using single precision as done in the
present program), is large enough to affect the matrix inversion
calculations to the extent that the inverses of large matrices might
be worthless. Without presenting the details of the error propagation
as  present  in  the  modified  Gaussian  elimination  method  used  in  the 
program,  it  can  be  stated  that  most  errors  are  introduced  by  the 
subtraction  of  large  numbers  from  other  large  numbers  where  these 
numbers  differ  only  in  the  last  few  digits.  These  digits  may  well  be 
beyond the last accurate one, i.e., beyond the thirteenth digit at
the  start  of  the  calculations.  One  consequence  of  this  may  be,  for 
example,  the  appearance  of  one  or  more  negative  elements  in  the  main 
diagonal  of  the  inverse,  leading  to  the  program  statement  that  an 
inverse  could  not  be  obtained.  Another  consequence  could  be  that, 
although  the  inverse  can  be  obtained,  the  calculated  identity  matrix, 

I.,  deviates  from  the  true  identity  matrix  such  that  the  accuracy 
checks  on  the  main  diagonal  elements  of  I.  fail.  This  "limited 
computer  accuracy"  will  cause  failures  most  often  in  polynomial 
regression  with  high  order  terms  contained  in  the  model.  At  this 
point  it  must  be  recalled  that  the  criterion  by  which  the  program 
accepts  or  rejects  a  run  is  dependent  upon  the  analyst's  choice. 

That is, the program user chooses the value of 1(2) which will be the
critical value not to be exceeded by the deviation (from 1) of any main
diagonal element of the calculated identity matrix. (See Section VI.1.b.)
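
The kind of accuracy check described here can be sketched in Python as
follows. This is only an illustration under stated assumptions: a plain
Gauss-Jordan elimination with partial pivoting stands in for the
modified Gaussian elimination used in the program, the tolerance `tol`
stands for the critical value chosen by the user, and all function names
are hypothetical.

```python
def invert(a):
    """Gauss-Jordan inverse with partial pivoting.

    Raises ValueError when a pivot is exactly zero, i.e., when no
    inverse is obtained.
    """
    n = len(a)
    aug = [[float(v) for v in row] +
           [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix does not invert")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def max_diagonal_deviation(a, a_inv):
    """Largest deviation from 1 on the main diagonal of a * a_inv."""
    n = len(a)
    return max(abs(sum(a[i][k] * a_inv[k][i] for k in range(n)) - 1.0)
               for i in range(n))


def passes_identity_check(a, tol):
    """True if every main diagonal element deviates from 1 by <= tol."""
    return max_diagonal_deviation(a, invert(a)) <= tol


# Hypothetical, well-conditioned 2 x 2 matrix.
print(passes_identity_check([[4.0, 1.0], [1.0, 3.0]], 1e-10))
```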

As a corrective measure to overcome the failures caused
by the limited computer accuracy the following transformation of the
independent variables is sometimes sufficient:

$$v = \frac{x - \bar{x}}{R_x} . \tag{VII-1}$$

This transformation, which is often also referred to as "coding" of
the x's, is essentially a standardization, with centralization effected
by the subtraction of the average, $\bar{x}$, from the original
observation, x, and with |v| < 1 effected through division by the range
$R_x = x_{\max} - x_{\min}$.
The transformation will be applied only to the "original" independent
variables (OCIV's), and in polynomial regression, all higher order and
cross-product terms (GCIV's) will be generated from the v variables.

(As can easily be seen, if the GCIV's were also transformed, the matrix
of the normal equations would have characteristics similar to those of
a Hilbert matrix.) The transformation has the effect of keeping close
to zero those elements in the matrix of the normal equations which, in
polynomial regression, are sums of odd powers of the v values
($\sum v^3 \approx 0$, for example), or those elements which, in general
multiple regression, are proportional to the covariance of two
uncorrelated independent variables ($\sum v_1 v_2 \approx 0$, for
example). The other elements of the matrix, for instance, the sums of
the even powers in polynomial regression, are kept small by the
transformation because of |v| < 1. The transformation, then, results in
sufficiently large contrasts among the matrix elements of now smaller
absolute value such that the subtractions mentioned before can be done
with much higher accuracy.
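
The coding transformation can be sketched in Python as follows; the
function name and the data are hypothetical and serve only as an
illustration. The higher order term is generated from the coded
variable, as described above, and is not coded itself.

```python
def code_variable(x):
    """Coding of an original variable: v = (x - average) / range."""
    n = len(x)
    xbar = sum(x) / n
    x_range = max(x) - min(x)
    return [(xi - xbar) / x_range for xi in x]


# Hypothetical observations of one original independent variable.
x = [10.0, 12.0, 15.0, 19.0, 24.0]
v = code_variable(x)

# Higher order terms are generated from v.
v_squared = [vi ** 2 for vi in v]

print(v)
print(sum(v), max(abs(vi) for vi in v))
```

The coded values sum to zero (up to rounding) and all lie strictly
between -1 and 1.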

It  should  be  noted  that  the  adjustment  for  the  average 
x  value  as  achieved  in  the  v  transformation  leads  to  a  much  higher 
computational  accuracy  than  can  be  achieved  by  starting  with  the 
regression  model  (VI-2)  in  which  the  independent  variables  are 
adjusted  for  their  average  values. 

In  case  of  polynomial  regression  the  v  transformation  can 
become  problematic  to  the  program  user  who  needs  or  wants  prediction 
equations  in  the  original  x  space.  Only  under  a  rather  severe 
restriction  (to  be  defined)  will  the  regression  sum  of  squares  (ASSR) 
due  to  a  group  of  independent  variables  in  the  v  space  be  equal  to  the 
regression  sum  of  squares  due  to  the  corresponding  group  of  independent 
variables  in  the  x  space.  Before  defining  the  restriction,  a  very 
simple  example  is  given  in  order  to  illustrate  the  situation.  This 
example contains only one "original" independent variable, x. Imagine
first that only its squared term ($x^2$) is included in the regression
model. The regression sum of squares adjusted for the mean, ASSR,
due to $x^2$ is:

$$ASSR(x^2) = \frac{\bigl[\sum x^2 (y - \bar{y})\bigr]^2}
{\sum \bigl(x^2 - \overline{x^2}\bigr)^2} .$$

Applying the v transformation to x, one gets for the corresponding
regression sum of squares due to $v^2$:

$$ASSR(v^2) = \frac{\bigl[\sum v^2 (y - \bar{y})\bigr]^2}
{\sum \bigl(v^2 - \overline{v^2}\bigr)^2} .$$

Since $v = \frac{x - \bar{x}}{R_x}$, $ASSR(v^2)$ can be rewritten as

$$ASSR(v^2) = \frac{\bigl[\sum (x - \bar{x})^2 (y - \bar{y})\bigr]^2}
{\sum \bigl[(x - \bar{x})^2 - \overline{(x - \bar{x})^2}\bigr]^2} .$$

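Now it can be shown that

$$ASSR(x^2) \neq ASSR(v^2) .$$

For this it is sufficient to show that the two denominators are not
proportional to each other. Indeed, one has

$$\sum \bigl(x^2 - \overline{x^2}\bigr)^2 = \sum x^4 - \frac{[\sum x^2]^2}{n}$$

and

$$\sum \bigl[(x - \bar{x})^2 - \overline{(x - \bar{x})^2}\bigr]^2
= \sum x^4 - \frac{[\sum x^2]^2}{n} + \delta ,$$

where $\delta$ is not identically zero:

$$\delta = 4 n \bar{x} \bigl[ -\overline{x^3} + 2 \bar{x}\, \overline{x^2}
- \bar{x}^3 \bigr] \neq 0 .$$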

Imagine  next  that  only  the  linear  terms,  x  or  v,  are 
included  in  the  two  models.  It  is  easy  to  show  that  the  two  regression 
sums  of  squares  are  now  equal: 


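\[ \mathrm{ASSR}(x) = \frac{\left[\sum (x-\bar{x})(y-\bar{y})\right]^2}{\sum (x-\bar{x})^2} \]

and

\[ \mathrm{ASSR}(v) = \frac{\left[\sum (v-\bar{v})(y-\bar{y})\right]^2}{\sum (v-\bar{v})^2} . \]

Since $\bar{v} = 0$, one has

\[ \mathrm{ASSR}(v) = \frac{R_x^{-2}\left[\sum (x-\bar{x})(y-\bar{y})\right]^2}{R_x^{-2}\sum (x-\bar{x})^2} = \mathrm{ASSR}(x) . \]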

Finally,  the  two  regression  sums  of  squares  are  again  equal 
when  both  the  linear  and  quadratic  terms  are  included  in  the  models: 

\[ \mathrm{ASSR}(x, x^2) = \mathrm{ASSR}(v, v^2) . \]

The  algebraic  proof  for  this  is  omitted  because  of  its  length. 
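
The three cases discussed above can be checked numerically. The following is a minimal sketch, not part of DA-MRCA: the five data points are hypothetical, the function names are chosen for illustration only, and the two-term ASSR is obtained by solving the 2 x 2 normal equations of the mean-adjusted variables.

```python
def centered(values):
    """Subtract the arithmetic mean from every element."""
    mean = sum(values) / len(values)
    return [val - mean for val in values]


def dot(a, b):
    """Sum of element-wise products."""
    return sum(p * q for p, q in zip(a, b))


def assr(columns, y):
    """Regression sum of squares adjusted for the mean (one or two terms)."""
    cols = [centered(col) for col in columns]
    yc = centered(y)
    if len(cols) == 1:
        return dot(cols[0], yc) ** 2 / dot(cols[0], cols[0])
    c1, c2 = cols
    s11, s12, s22 = dot(c1, c1), dot(c1, c2), dot(c2, c2)
    s1y, s2y = dot(c1, yc), dot(c2, yc)
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det  # Cramer's rule
    b2 = (s2y * s11 - s1y * s12) / det
    return b1 * s1y + b2 * s2y


# Hypothetical data, chosen only so that the mean of x is not zero.
x = [1.0, 2.0, 3.0, 5.0, 8.0]
y = [2.0, 2.9, 4.1, 7.2, 12.5]

x_mean = sum(x) / len(x)
x_range = max(x) - min(x)
v = [(xi - x_mean) / x_range for xi in x]  # v = (x - mean) / range
x_sq = [xi ** 2 for xi in x]
v_sq = [vi ** 2 for vi in v]

print("squared term only :", assr([x_sq], y), assr([v_sq], y))
print("linear term only  :", assr([x], y), assr([v], y))
print("linear + squared  :", assr([x, x_sq], y), assr([v, v_sq], y))
```

For these data the squared-term-only values differ clearly between the two spaces, while the other two pairs agree to within rounding error.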

More generally, it can be demonstrated that the respective
regression sums of squares in the x and v space are equal only when the
polynomial regression models of order k, say, also include all terms
of lower order than k:

\[ \mathrm{ASSR}(x, x^2, \ldots, x^{k-1}, x^k) = \mathrm{ASSR}(v, v^2, \ldots, v^{k-1}, v^k) . \]

This condition is generally also valid for polynomial regression models
in more than one original independent variable. For example, in a
case of two original independent variables, $x_1$ and $x_2$, and a model which
is to include the cross-product term ($x_1 x_2$ or $v_1 v_2$), one has to include
also the linear terms ($x_1$ and $x_2$, or $v_1$ and $v_2$, respectively) in order
to have the regression sums of squares equal in the x and the v space:

\[ \mathrm{ASSR}(x_1, x_2, x_1 x_2) = \mathrm{ASSR}(v_1, v_2, v_1 v_2) . \]

This leads to the following conclusion: When the program user finds,
for accuracy purposes, a need to apply the transformation (VII-1) and
when he wants to keep, with respect to the regression sums of squares,
the relations between corresponding terms of the two polynomial
regression models undisturbed by the transformation, he must follow
this Restriction: A polynomial regression model must contain all
polynomial terms (including the linear terms) which can be separated
as factors from the highest order terms contained in the model.

The  program  user  can  easily  adhere  to  this  restriction 
when  linear  hypotheses  are  to  be  tested  by  the  option  for  hand  selected 
reruns.  When  the  user  wants  to  automatically  rank  the  transformed 
polynomial terms by IVOR or BIVOR, he can adhere to the restriction by
application  of  the  grouping  feature  as  available  in  both  routines. 

For  this  the  polynomial  terms  should  be  grouped  according  to  their 
powersum  which  is  defined  to  be  the  sum  of  all  exponents  of  the 
original  independent  variables  contained  in  a  term.  For  example,  in 
a  polynomial  of  second  degree  in  two  (transformed)  independent  variables 
$v_1$ and $v_2$, there would be two groups in IVOR and in BIVOR: $v_1$ and $v_2$
would form the first group with a powersum of 1 in each term, and $v_1^2$,
$v_1 v_2$, and $v_2^2$ would form the second group with a powersum of 2 in each
term. Since the ranking begins in the first group in IVOR and in the
last group in BIVOR, it can be seen that the above restriction is
followed. It is, however, obvious that the restriction is being
followed in an overstrict fashion: When in BIVOR, for example, $v_2^2$
and $v_1 v_2$ have been found to be the least important terms in the last
(second) group, $v_1^2$ is ranked automatically as the next least important
term. In reality, at this step both $v_1^2$ and $v_2$ should be "admissible"
for  the  determination  of  which  term  contributes  less  to  the  regression 
sum of squares when contained in the model. Note: In N0VAC0M (see
Section II.3) a BIVOR type ranking procedure can optionally be performed
such that at each step all those polynomial terms become "admissible"
for ranking which cannot be separated as factors from other terms
contained in the model. Therefore, the terms become admissible in the
desired fashion, that is, according to the above restriction to be
followed when the accuracy transformation

\[ v = \frac{x-\bar{x}}{R_x} \]

is applied and when the models in the x and in the v space are to
correspond to each other.
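
The grouping by powersum described above can be sketched as follows. This is an illustration only, not the grouping code of IVOR or BIVOR: each polynomial term is represented, by assumption, as a tuple of exponents of the original independent variables, and the function names are hypothetical.

```python
from collections import defaultdict


def powersum(term):
    """Sum of all exponents of the original independent variables in a term."""
    return sum(term)


def group_by_powersum(terms):
    """Return the terms as a list of groups, ordered by increasing powersum."""
    groups = defaultdict(list)
    for term in terms:
        groups[powersum(term)].append(term)
    return [groups[key] for key in sorted(groups)]


# Second-degree polynomial in v1 and v2: (1, 0) stands for v1,
# (1, 1) for v1*v2, (0, 2) for v2**2, and so on.
second_degree = [(1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]
groups = group_by_powersum(second_degree)
for number, group in enumerate(groups, start=1):
    print("group", number, ":", group)
```

A forward ranking would then start in the first group and a backward ranking in the last group.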

When  the  program  user  adheres  to  the  restriction,  he  will 
in  fact  have  a  model  (for  example,  a  significant  model)  which  corresponds, 
term  by  term,  to  the  model  in  the  original  space.  If  it  is  desired 
and  feasible,  the  program  user  can  then  retransform  the  values  of  the 
estimated  regression  coefficients  into  the  values  which  the  corresponding 
coefficients have in the original space. Naturally, the retransformation
is very simple when product terms are not included in the model. In
this case the regression coefficients of the original space are obtained
by dividing the regression coefficients of the transformed space by the
respective ranges R. In general, however, one would make use of the
model  obtained  in  the  transformed  space  by  transforming  the  coordinates 
of  any  design  point  of  the  original  space  for  which  one  wants  to  compute 
the  predicted  value  of  the  dependent  variable  and/or  confidence  limits. 

Although the transformation (VII-1),

\[ v = \frac{x-\bar{x}}{R_x} , \]

seems to be the most effective one to increase the accuracy, division
by a constant or subtraction of a constant sometimes is satisfactory.
Division by a constant, that is the transformation $v' = x/E$, avoids the
disadvantages which are characteristic of the transformation (VII-1):
The retransformation of the model consists merely of dividing the
regression coefficient obtained in the transformed space by E. In
polynomial regression the retransformation consists of dividing the
obtained regression coefficient of a polynomial term by the corresponding
product of the E values used in the transformation of the original
independent variables. For example, the regression coefficient obtained
for the term

\[ \frac{x_1}{E_1}\left(\frac{x_2}{E_2}\right)^2 \]

is retransformed by dividing by $E_1 E_2^2$.
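
The retransformation rule for division by a constant can be illustrated with a short sketch. The scale constants, the coefficient value, and the function name below are hypothetical; the sketch only shows that a coefficient divided by the product of the E values (each raised to the exponent of its variable) reproduces the same term value in the original space.

```python
def retransform_coefficient(coef_transformed, scales, exponents):
    """Divide a coefficient by the product of E values raised to their exponents."""
    divisor = 1.0
    for scale, power in zip(scales, exponents):
        divisor *= scale ** power
    return coef_transformed / divisor


# Hypothetical scale constants (powers of ten) and coefficient.
E1, E2 = 100.0, 10.0
b_transformed = 2.5  # coefficient of (x1/E1) * (x2/E2)**2
b_original = retransform_coefficient(b_transformed, [E1, E2], [1, 2])

# The contribution of the term must be the same in both spaces.
x1, x2 = 250.0, 40.0
term_transformed = b_transformed * (x1 / E1) * (x2 / E2) ** 2
term_original = b_original * x1 * x2 ** 2
print(b_original, term_transformed, term_original)
```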

The effect of the $x/E$ transformation, with respect to
accuracy, is similar to that of the division by $R_x$ in the v transformation:
If the value of E is properly chosen, the absolute values of the
transformed data can be made to lie between 0 and 1. This can sometimes be
achieved by choosing the proper power of ten for E, in which case the
transformation can easily be executed by hand. However, this
transformation is of little value if all or most of the untransformed OCIV
coordinates are of equal sign. In this case the other simple
transformation, i.e., the subtraction of a constant such that centralization
is achieved, is sometimes sufficient. The constant G in this
transformation, $v'' = x - G$, should be conveniently chosen close to the
average of the x values, i.e., G should be a "working average." If
it is appropriate to choose G as a whole number, this transformation
also can easily be performed by hand. The transformation $x - G$ has,
however, the same type of side-effects with respect to the
retransformation of a polynomial model as were shown to exist for the
transformation

\[ v = \frac{x-\bar{x}}{R_x} . \]

The transformations

\[ v = \frac{x-\bar{x}}{R_x} \quad \text{and} \quad v' = \frac{x}{E} \quad (\text{but not } v'' = x - G) \]

can automatically be applied to the coordinates of the OCIV's by the
preprocessor program MTRAN, as was mentioned in Section II.2. The
output of MTRAN may be on cards or tape and represents the data input
for DA-MRCA, i.e., the information usually punched on Card Type 8.

The following numerical example is given in order to
illustrate the effects of the transformation $v = (x-\bar{x})/R_x$. The problem
contains one original independent variable x with 9 distinct levels.
In the x space a polynomial of 5th degree was the highest that could be
fitted by DA-MRCA, whereas, after application of the v transformation,
a polynomial of 8th degree could be obtained. (Naturally in this
example, this is the zero error perfect fit.) The printout shown is
a reproduction of a part of the original printout of DA-MRCA for this
example. The 9 data points are given below, where also the transformed
(coded) x values are shown.


      y         x       v = (x - \bar{x})/R_x

     9.5      47.30      -.45861017
     0.6      47.47      -.45276825
    43.7      54.65      -.20603285
    49.9      54.83      -.19984729
    48.3      61.90      +.04310804
    65.5      64.20      +.1221458
    96.4      68.43      +.26750667
   128.5      70.63      +.34310804
   149.1      76.40      +.5413898
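
The coded values of this example can be reproduced with a few lines. This is a sketch outside of MTRAN and DA-MRCA, with a hypothetical function name; it assumes that R is the range (maximum minus minimum) of the x values and reads the second and fifth x entries as 47.47 and 61.90. Because the sketch works in double precision, the last printed digits need not match the original printout.

```python
def v_transform(values):
    """Code x as v = (x - mean) / range, so that v has mean 0 and range 1."""
    mean = sum(values) / len(values)
    value_range = max(values) - min(values)
    return [(val - mean) / value_range for val in values]


# The nine x levels of the numerical example.
x = [47.30, 47.47, 54.65, 54.83, 61.90, 64.20, 68.43, 70.63, 76.40]
v = v_transform(x)

for x_value, v_value in zip(x, v):
    print(f"{x_value:6.2f}  {v_value:+.8f}")
```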

VII.2.b  Linear Dependencies

Linear dependencies among all or some of the rows
(columns) of the matrix of the normal equations of a given run will
cause this matrix to be singular and, therefore, fail to invert.
Sometimes a fictitious inverse will be computed by the program because
of the presence of truncation errors, see Section VII.2.c below. In
some cases the analyst will be able to infer, from visual inspection
of the number and the relative position of the $n_N$ distinct input
design points, as given in the design matrix, that linear dependencies
are present. These will be referred to as "obvious" linear
dependencies. They occur, for example, when the analyst includes as many
or more independent variables in the regression model of a given run
as there are distinct design points. For a discussion of some obvious
linear dependencies see the end of this section.

In  general,  the  linear  dependencies  will  be  "non-obvious" 
and,  therefore,  unknown  to  the  analyst  from  visually  inspecting  the 
design  matrix.  It  is  in  this  sense  that  the  linear  dependencies  are 
discussed  here  as  a  cause  for  a  failure.  The  algebraic  parts  of  the 
discussion  are  presented  in  terms  of  the  main  run;  however,  all 
conclusions  are  naturally  equally  valid  for  any  rerun. 

The  matrix  A  of  the  normal  equations  of  the  main  run  can 
be  expressed  in  terms  of  the  design  matrix  X  as  follows: 

\[ A = X'X , \]

with

\[ X = \begin{pmatrix}
x_{01} & x_{11} & x_{21} & \cdots & x_{N1} \\
x_{02} & x_{12} & x_{22} & \cdots & x_{N2} \\
\vdots & \vdots & \vdots &        & \vdots \\
x_{0n} & x_{1n} & x_{2n} & \cdots & x_{Nn}
\end{pmatrix} , \]

where $x_{0i} = 1$. Since

\[ \mathrm{rank}[A] = \mathrm{rank}[X] , \]

X must be of rank N+1 in order that A is a non-singular matrix, assuming
that $n_N \ge N+1$. By definition, X is of rank N+1 when no linear dependencies
exist among its N+1 columns. In other words, as soon as the coordinates
$(x_1, x_2, \ldots, x_v, \ldots, x_N)_i$ of the $n_N$ distinct input design points
satisfy the identity

\[ \sum_{v=0}^{N} a_v x_{vi} = 0 \quad [i = 1, \ldots, n_N] \qquad \text{(VII-2)} \]

with at least two coefficients, $a_v$, being different from zero, the rank
of X is smaller than N+1 and, thereby, A is singular. In a geometrical
interpretation, the identity

\[ \sum_{v=0}^{N} a_v x_{vi} = 0 \quad \{i\} \]

means  that  all  nN  distinct  design  points  are  located  on  a  hyperplane 
in  the  N-dimensional  space  defined  by  the  N  independent  variables. 

(This  hyperplane  could  have,  at  the  most,  N-l  dimensions.)  Except 
for  the  cases  of  "obvious"  linear  dependencies,  the  analyst  will  not 
be  able  to  determine,  without  further  analysis,  whether  or  not 
the $n_N$ distinct input design points are located on a plane in the
N-dimensional space. Should he want to determine this by analytical
means, he would have to calculate the value of the determinant of the
matrix consisting of any N+1 rows of X which represent distinct design
points. This can be a considerable effort. In the present program,
therefore, the detection of this general case of "non-obvious" linear
dependencies is left to the built-in checks for the possibility of
obtaining an inverse and to the checks on the accuracy of the calculated
identity matrix. When "non-obvious" linear dependencies are present
for  a  given  independent  variable  selection  and  when  a  fictitious 
inverse  is  obtained,  the  main  diagonal  elements  of  the  calculated 
identity  matrix  will  deviate  rather  drastically  from  1  and  the  run 
will  clearly  be  rejected. 

The only adequate corrective measure in the case of
non-obvious linear dependencies is to delete one independent variable
and to try to fit the reduced regression model. As discussed in
Section VI.2.d, this deletion is performed automatically in the BIVOR
option. IVOR, by nature, has an advantage over BIVOR in the handling
of non-obvious linear dependencies and the identification of perfect
fits. Since in BIVOR, indiscriminately, the rightmost independent
variable is deleted after a run was rejected, this deletion does not
necessarily eliminate the unwanted non-obvious linear dependency. In
fact, there could be many such deletions of rightmost IV's before a
perfect fit is reached by BIVOR. IVOR, in contrast, will select, at
each step, only those independent variables for possible inclusion into
the model whose inclusion will not introduce linear dependencies. By
this technique IVOR is capable of always finding the perfect fit with
the maximum number of independent variables contained in the model.

Another remark regarding linear dependencies concerns
the situation in which functions of the original independent variables
are added to the model, as is the case, for example, in polynomial
regression. Namely, it is wrong to assume that functional terms can
always be added when there are no (non-obvious) linear dependencies
caused by the original independent variables. The following simple
example from polynomial regression may serve to illustrate this and
the concept of the "non-obvious" linear dependency in general.

Example. Given the following $n_N = 4$ design points in the
plane of the two original independent variables $x_1$ and $x_2$,

      x_1     -1      0     +2     +3
      x_2     +1     -2     -2     +1

the regression model to be fitted is, say:

\[ Y = B_0 + B_1 x_1 + B_2 x_2 + B_3 x_1^2 , \]

which with 4 distinct design points should lead to a "zero error
perfect fit." The inclusion of the term $x_1^2$ appears to be feasible,
but it nevertheless leads to a non-obvious linear dependency: the
4 points $\{x_1, x_2, x_1^2\}$ are located on a plane in the 3-dimensional
space. As can easily be verified, the 4 points satisfy the identity
of form (VII-2), i.e., the 4 points are on a plane having this equation:

\[ 2 + 2x_1 + x_2 - x_1^2 = 0 . \]

It is, therefore, not possible to include $x_1^2$ in the regression model
when $x_1$ and $x_2$ are included.
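
The dependency in this example can be verified directly. The sketch below is an illustration and not the method used by DA-MRCA: it evaluates the plane equation at the four design points and determines the rank of the design matrix by Gaussian elimination in exact rational arithmetic, so that no truncation errors enter. The function name is hypothetical.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(value) for value in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Look for a non-zero pivot in this column at or below row `rank`.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# The four design points (x1, x2) of the example.
points = [(-1, 1), (0, -2), (2, -2), (3, 1)]

# Left-hand side of 2 + 2*x1 + x2 - x1**2 = 0 at each design point.
plane_values = [2 + 2 * x1 + x2 - x1 ** 2 for x1, x2 in points]

# Design matrices with columns 1, x1, x2 and 1, x1, x2, x1**2.
design_linear = [[1, x1, x2] for x1, x2 in points]
design_with_square = [[1, x1, x2, x1 ** 2] for x1, x2 in points]

print(plane_values)
print(matrix_rank(design_linear), matrix_rank(design_with_square))
```

The rank stays at 3 when the column for the squared term is added, so the 4 x 4 matrix of the normal equations for this model is singular.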

In the following, some "obvious" linear dependencies are
discussed, two of which are derived from the general case, i.e., by
specifying the coefficients, $a_v$, in the identity

\[ \sum_{v=0}^{N} a_v x_{vi} = 0 . \]

All these cases can readily be identified from the design matrix X
without further analysis. As has been the case previously in this
section, the discussion of the obvious linear dependencies also will
be presented in terms of the main run, i.e., for N independent variables.

Some "obvious" linear dependencies:

(1) In the identity (VII-2),

\[ \sum_{v=0}^{N} a_v x_{vi} = 0 , \quad i = 1, 2, \ldots, n_N , \]

all coefficients $a_v$ except $a_0$ and $a_{v^*}$ are zero:

\[ x_{v^* i} = -\frac{a_0}{a_{v^*}} = \text{constant} . \]

This means that the coordinate $x_{v^* i}$ is equal for all $n_N$ distinct
design points. All independent variables, $x_{v^*}$, satisfying this
condition must be deleted from the model.

(2) In the identity (VII-2), all coefficients except
$a_{v^*}$ and $a_{v^{**}}$ are zero:

\[ a_{v^*} x_{v^* i} + a_{v^{**}} x_{v^{**} i} = 0 , \]

or

\[ \frac{x_{v^* i}}{x_{v^{**} i}} = -\frac{a_{v^{**}}}{a_{v^*}} = \text{constant} . \]

This is the case of proportionality for all $n_N$ coordinates $x_{v^* i}$ and
$x_{v^{**} i}$. One independent variable out of each pair $x_{v^*}$, $x_{v^{**}}$ satisfying
this condition must be deleted from the model.

(3) $n_N \le N$. This is the case of trying to fit too many
independent variables for the number, $n_N$, of distinct input design
points available. It will be met mostly in situations where functions
of the original independent variables have been included in the regression
model, as is the case in polynomial regression. The identity (VII-2)
is automatically fulfilled by the $n_N$ design points since all $n_N$
points are necessarily located on a "plane" in the N-dimensional space
defined by the N independent variables. At least $N - n_N + 1$ independent
variable(s) must be deleted from the model in order to arrive at a
solution.


(4) This case applies only when functions $x_v = f_v(z_1, z_2, \ldots, z_j, \ldots)$
of the original independent variables, $z_j$, are
included in the model, as is the case, for example, in polynomial
regression. It is related to case (3) ($n_N \le N$) and defined as follows.
Let the number of distinct values (coordinates) of the original
independent variable $z_j$ be $L_j$. The set of all functional terms $x_v = f_v$
of the model which contain $z_j$ can be divided into groups such that a
group consists of all those terms $f_v$ which contain one or more other
variables $z_{j^*}$ ($j^* \ne j$), all in an identical functional form. (The terms
$f_v$ in one of these groups need not contain any variable other than $z_j$.)
Let the maximum number of terms $f_v$ in any group be $M_j$. Then an obvious
linear dependency exists if $L_j \le M_j$.

As a complex and probably unrealistic example, intended
to illustrate the above definition, imagine that the model includes
the following set of 9 terms all of which contain $z_j$ ($z_j \ne z_{j^*} \ne z_{j^{**}}$):

$z_{j^*} \sin(z_j)$, $z_{j^*} \sin(2z_j)$, $z_{j^*} \sin(3z_j)$; $\cos(z_j)$, $\cos(2z_j)$;
z ]**  cos(ZjZ fit);  Zj**  cos(zjZj*);  z***  cos(ZjZj*);  z ***  cos(zjZ,*). 

The first three terms contain $z_{j^*}$ in an identical functional form,
namely as a multiplier. The next two terms do not contain any other
variable than $z_j$; and the last four terms each contain $z_{j^{**}}$ in a
different functional form. This makes 6 groups with 3, 2, 1, 1, 1,
1 terms, respectively. Therefore, $M_j$ equals 3. Should the number
$L_j$ of distinct values of $z_j$ be smaller than or equal to 3, the
inclusion of the first of the above groups (with 3 terms) in the
model would lead to an obvious linear dependency.

In this case of $L_j \le M_j$ the identity (VII-2) is again
automatically fulfilled since the total number $n_N$ of distinct design
points will be located on a "plane" in the N-dimensional space defined
by the N independent variables, as can readily be verified. For each
original independent variable $z_j$ for which $L_j \le M_j$ is true, at least
as many terms containing $z_j$ per group must be deleted from the model
such that, at the most, $L_j - 1$ terms per group will remain. In the above
example, deletion of $z_{j^*} \sin(3z_j)$, say, would eliminate the obvious
linear dependency if $L_j$ is assumed to be exactly 3.
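
Cases (1) to (3) lend themselves to a simple screening of the design points before a run. The following sketch is not part of DA-MRCA; the function name, the returned labels, and the example data are hypothetical. It assumes exactly representable coordinates (for example integers), because the proportionality test uses exact equality, and it does not attempt case (4).

```python
def obvious_dependencies(points):
    """Screen design points for the obvious dependencies of cases (1) to (3).

    Each point is a tuple holding the coordinates of the N independent
    variables (the constant term x0 = 1 is not included).
    """
    distinct = sorted(set(points))
    n_points = len(distinct)
    n_vars = len(distinct[0])
    columns = list(zip(*distinct))
    findings = []

    # Case (1): a coordinate that is equal for all distinct design points.
    constant = set()
    for v, column in enumerate(columns):
        if len(set(column)) == 1:
            constant.add(v)
            findings.append(("constant", v))

    # Case (2): two coordinates that are proportional for all points,
    # i.e. every 2 x 2 minor formed from the two columns vanishes.
    for v in range(n_vars):
        for w in range(v + 1, n_vars):
            if v in constant or w in constant:
                continue
            a, b = columns[v], columns[w]
            if all(a[i] * b[k] == a[k] * b[i]
                   for i in range(n_points)
                   for k in range(i + 1, n_points)):
                findings.append(("proportional", v, w))

    # Case (3): not more distinct design points than independent variables.
    if n_points <= n_vars:
        findings.append(("too_few_points", n_points, n_vars))
    return findings


# Hypothetical design with N = 4 variables and one repeated point.
design = [(1, 2, 5, 4), (2, 4, 5, 1), (3, 6, 5, 0), (1, 2, 5, 4)]
report = obvious_dependencies(design)
print(report)
```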


VII.2.c  Truncation Errors

Truncation errors are, naturally, present in all computations
performed. As indicated before, these errors become particularly
important in one situation, i.e., when the matrix is singular (obvious
or non-obvious linear dependencies being present) and, consequently,
an inverse does not exist. In this situation the truncation errors
sometimes lead to a fictitious inverse which, however, in all cases
should be identified as such by the failure of the calculated identity
matrix to pass the accuracy checks. This fictitious inverse is usually
caused by an element of the main diagonal of the inverse which
theoretically has the value zero but actually equals a small positive
quantity stemming from a truncation error. One can, in fact, construct
very simple cases with singular matrices for which the computer will
obtain fictitious inverses.

There is no possibility whatsoever to avoid the "failures"
which are caused by these errors when one deals with singular matrices.
The analyst has to rely entirely upon the accuracy check on the
calculated identity matrix in order to be protected from this type of a
fictitious problem solution. In the experience of the authors no
actual case occurred in which the inverse of a matrix known to be
singular passed the identity matrix checks.
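
The role of the identity matrix check can be illustrated with a small, deliberately simplified sketch. It does not reproduce subroutine GAUSS or the actual checks of DA-MRCA: the truncation error is simulated here by adding a small quantity to one element of a singular 2 x 2 matrix before inversion, and the function names and the matrices are hypothetical.

```python
def inverse_2x2(matrix):
    """Inverse of a 2 x 2 matrix by the adjugate formula."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def max_diagonal_deviation(matrix, inverse):
    """Largest deviation from 1 on the main diagonal of matrix * inverse."""
    n = len(matrix)
    return max(
        abs(sum(matrix[i][k] * inverse[k][i] for k in range(n)) - 1.0)
        for i in range(n)
    )


# A singular matrix (second row is twice the first row).
singular = [[1.0, 2.0], [2.0, 4.0]]

# Simulated truncation error: an element that should lead to a zero
# determinant is off by a small positive quantity.
perturbed = [[1.0, 2.0], [2.0, 4.0 + 1e-12]]
fictitious_inverse = inverse_2x2(perturbed)

# A well-conditioned matrix for comparison.
regular = [[2.0, 1.0], [1.0, 3.0]]

deviation_singular = max_diagonal_deviation(singular, fictitious_inverse)
deviation_regular = max_diagonal_deviation(regular, inverse_2x2(regular))
print(deviation_singular, deviation_regular)
```

In this sketch the calculated identity matrix of the singular case has a diagonal element far from 1, so any reasonable tolerance rejects the fictitious inverse, while the regular case passes.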

VIII.  FORTRAN IV DOCUMENTATION OF DA-MRCA


In previous chapters of this report, references to problem
variables have, in most instances, been made in terms of the general
mathematical notation used. However, in the programming and coding
phases of the DA-MRCA program, it has been necessary to redefine some
of these variables in an acceptable FORTRAN IV variable notation. In
addition, other variables have required initial definition due to the
storage allocation conventions of the FORTRAN IV language.

Some of these FORTRAN variables have been defined in previous
chapters of this report. For example, variable descriptions are
provided in Chapter V (INPUT PREPARATION). However, if the reader has
the desire or need to study and understand the FORTRAN formulation of
the program, additional information is required to associate the
mathematical concepts with the FORTRAN IV documentation.

This chapter, therefore, presents the FORTRAN IV documentation
of the DA-MRCA program in the form of a glossary of program variables,
flow charts, conversion notes, and a complete listing of the program.


VIII.1  Description of Program Variables

In  this  section  are  defined  the  program  variables  which  are 
contained,  (a)  in  COMMON  storage,  (b)  in  the  MAIN  PROGRAM,  and  (c)  in 
program  subroutines . 

Input  variables,  indices  of  DO-loops,  most  variables  defined  in 
DATA  statements,  and  most  arguments  in  subroutines  are  not  defined  here. 


VIII.1.a  Variables in COMMON Storage

A       - an array containing the matrix (A) of the normal equations;
          subroutine GAUSS changes this matrix to its inverse.
ARP     - an array into which the array A is saved before subroutine
          GAUSS is called.
AW      - an array which contains averages of the independent variables
          and the dependent variable.
AW      - an array which contains averages of independent variables in
          subroutine PREVAR and which contains the various regression
          sums of squares adjusted for the mean in subroutines IV0R and
          BIV0R.
B       - an array containing the constants, $\sum x_v y$, of the normal
          equations; subroutine GAUSS changes this vector to contain the
          solution of the normal equations (i.e., the regression
          coefficients).
BB      - an array which is used to save the constants, $\sum x_v y$, of the
          normal equations.
BSDEV   - an array which contains the standard deviations of the
          regression coefficients.
DETERM  - the determinant of A.
ERR0R   - a variable which is used as an error return from subroutines
          ABT and GAUSS and which controls printout in subroutine REDUCM.
IBIDS   - a variable which is used in conjunction with IBID to control
          the computation and checking of the identity matrix.
ICASE   - a counter for the number of inverse matrices which are printed.
ISKIP   - if the main run was rejected for any reason, ISKIP=2;
          otherwise ISKIP=1.
IT0TAL  - initially set equal to the rank of the matrix of the normal
          equations, A, for the main run, this value is later used,
          in IV0R and BIV0R, as the upper limit on the number of
          independent variables at various steps of these subroutines.
JLIM    - a variable which is set equal to IR+1, the number of OCIV's
          given as input, plus 1.
KMUM    - a variable which indicates step size in the looping used to
          read the data input.
KNUM    - a variable which is used by subroutine RDIT as the number of
          data fields per record and by subroutine BIV0R to indicate to
          subroutine CASSR that CASSR is being called from BIV0R.
M       - the total number of data points (= n in previous chapters).
M1      - a variable which indicates when the data termination card has
          been read.
M4      - a variable which is used to control page headings in
          subroutine CMFR.
N       - the number of independent variables present in the model at
          any step.
NN      - the number of independent variables present in the model (at
          any step), plus 2.
NNL     - a variable which is used to index the last row and/or last
          column of the summation matrix S containing the constants,
          $\sum x_v y$, of the normal equations.
NNN     - the rank of the matrix of the normal equations at any step.
NNNSAV  - a variable which saves the rank of the matrix of the normal
          equations for the main run.
NNSAV   - a variable which saves the main run value of the variable NN.
NNXA    - a variable equal to the main run value of the variable NNN.
N0BS    - this variable (EQUIVALENCED to IDG0 in subroutines ABT,
          IDENTM, and PRINTM) is used to indicate the acceptance or
          rejection of the identity matrix.
NPED    - a variable which controls the predicted value and Chi-square
          computations.
RECM    - the reciprocal of the number of observations M.
RSSM0   - this variable value equals the main run regression sum of
          squares adjusted for the mean. If the main run does not
          pass the four checks on the determinant of A, $R^2$, $s^2$, and
          the $c_{vv}$ (see paragraphs B, D, E, and F of Section VI.2.a.(2)),
          this value is negative indicating that no final comprehensive
          is to be printed.
S       - the summation matrix; the first N+1 rows and N+1 columns
          represent the matrix of the normal equations; the (N+2)th
          row and column are the constants, $\sum x_v y$ (v=0,1,...,N), and
          $\sum yy$, of the normal equations.
SDEV    - the square root of the residual variance.
SELECT  - this variable indicates whether a rerun is a hand selected
          rerun, an IV0R rerun, or a BIV0R rerun, for printout purposes.
X       - an array which contains the coordinates for each data point.
XD      - an array which is used in subroutine PREVAR to contain the
          coordinates of the selected input or synthetic design points
          adjusted for the averages of the corresponding input coordinates.
YSDEV   - an array containing the prediction standard deviations.
YY      - an array containing the predicted values.
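
The layout of the summation matrix S described in this glossary can be pictured with a short sketch. It is an illustration under stated assumptions, not a transcription of the FORTRAN code: the function name and the data are hypothetical, the constant term is taken as a leading column of ones, and the dependent variable is appended as the last column, so that S has N+2 rows and columns.

```python
def summation_matrix(x_rows, y):
    """Sums of squares and cross-products of (1, x_1, ..., x_N, y)."""
    size = len(x_rows[0]) + 2  # N + 2, the value described for NN
    s = [[0.0] * size for _ in range(size)]
    for row, y_value in zip(x_rows, y):
        extended = [1.0] + list(row) + [y_value]
        for p in range(size):
            for q in range(size):
                s[p][q] += extended[p] * extended[q]
    return s


# Hypothetical data: three data points, N = 2 independent variables.
x_rows = [(1.0, 2.0), (2.0, 0.0), (3.0, 1.0)]
y = [1.0, 2.0, 4.0]
s = summation_matrix(x_rows, y)

n_vars = len(x_rows[0])
normal_matrix = [row[: n_vars + 1] for row in s[: n_vars + 1]]  # first N+1 rows/columns
constants = [s[v][n_vars + 1] for v in range(n_vars + 1)]       # sums of x_v * y
sum_yy = s[n_vars + 1][n_vars + 1]                              # sum of y * y
print(normal_matrix, constants, sum_yy)
```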

VIII.1.b  Variables in the MAIN PROGRAM

IAPE    - the actual logical tape number of the tape containing the
          coordinates of the data points.
INDX    - a variable which is set equal to IR+1. The coordinates of
          the first IR independent variables, modified by the independent
          variable selection, are printed to identify the selected input
          and/or synthetic design points.
K0UNT   - counter for the selected input and/or synthetic design points.
MMM     - an index used in the coding to reverse the order of input
          items in the L0T array.
NSAV    - saves the main run value of N.
XIT     - the time which is computed by the various timing subroutines.
XYIT    - used only as a required argument to the E0F function.


VIII.1.c  Variables in Program Subroutines

(1) Variables in Subroutine ABT*

ASSR    - the regression sum of squares adjusted for the mean.
ATSS    - the total sum of squares adjusted for the mean.
CHI     - an array whose jth element contains a contribution to the
          Chi-square statistic if the jth interval is the last of a
          group of intervals having a total of more than 5 expected
          prediction errors. Otherwise CHI(J) = -1.0.
CHISUM  - the Chi-square statistic.
CMPFR   - an array whose jth element contains summed expected prediction
          errors if the jth interval was the last of a group of intervals
          having a total of more than 5 expected prediction errors.
          Otherwise, the contents of CMPFR(J) are meaningless.
C0R     - the correlation coefficient.
C0RSQ   - the square of the correlation coefficient (i.e., the coefficient
          of determination).

*Note - T. Herring, who coded the program DA-MRCA, named this subroutine
        for a co-author of the report.

EDELTA  - the interval size in the Chi-square computations.
ERANGE  - the range of the prediction errors.
ESS?    - the main run value of the error sum of squares.
ESTEP   - an array which contains the upper bounds of the 30 intervals,
          into which the range of the prediction errors is divided.
ES2     - the sum of squares of the prediction errors; the check
          error sum of squares.
EYYL    - the minimum prediction error.
EYYU    - the maximum prediction error.
FGRAPH  - an array which contains the symbols for the prediction error
          frequency distribution bar chart.
FN      - a floating point representation of the rank of the matrix
          of the normal equations.
F0UT    - the F ratio for regression on deleted variables.
IDF     - the degrees of freedom of Chi-square.
IFGRPH  - the number of symbols which are to be printed on a line in
          the prediction error frequency distribution bar chart.
IFREQ   - an array which contains the frequencies of occurrence of
          prediction errors in the intervals delimited by the ESTEP
          array.
I0BF    - an array whose jth element contains the summed observed
          frequencies of prediction errors for a group of intervals
          if the jth interval was the last of a group of intervals
          containing a total of more than 5 expected prediction errors.
          Otherwise, the contents of I0BF(J) are meaningless.
IXMAX   - the element number of the maximum prediction error.
IXMIN   - the element number of the minimum prediction error.
NRO     - the number of independent variables for the main run.
NR1     - the number of data points minus the main run value of NNN,
          i.e., the degrees of freedom of the error variance.
NR2     - the number of independent variables which have been deleted
          from the model.
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SDEVSQ 

- the residual variance.

SSE - the residual, or error, sum of squares.

SSR - the unadjusted regression sum of squares.

XIT - the time, in seconds, for the execution of subroutine GAUSS.

(2) Variables in Subroutine BIV0R

AMAX - the maximum ASSR value.

AMIN - the minimum ASSR value.

IDUM - a dummy argument to subroutine CASSR.

IMAX - the L0T array index of that independent variable which is to
be deleted from the model.

ISEE - a variable value which ensures that the identity matrix will
be checked only until an inverse is found whose associated
identity matrix element deviations are all smaller than 1(1).

ISTART - a variable value which is used to define the L0T array index
of the leftmost independent variable of a group of independent
variables.

IXMAX - the index of the maximum ASSR value in the AW array.

IXMIN - the index of the minimum ASSR value in the AW array.

JL0T - a variable used to index the regression coefficients and
inverse matrix diagonal elements which are due to independent
variables for which ASSR values are to be computed.

JSAVE - the index of the L0T array element which is to be set
equal to 1 if the matrix inversion is not accepted.

KASSR - a counter of the ASSR values which are computed at each step.

KG0 - a variable which indicates the failure of the matrix inversion
in subroutine CASSR.

LAT - an array which holds the L0T array indices of the independent
variables for which ASSR values are computed.

N0BS - a variable which indicates whether or not a matrix inversion
is the first accepted inversion in subroutine BIV0R.
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NQQ 

- a variable value equal to the number of independent variables
in a group in the grouping feature for independent variables.

(3) Variables in Subroutine CASSR

ASSR - the regression sum of squares adjusted for the mean.

ATSS - the total sum of squares adjusted for the mean.

C0RSQ - the square of the correlation coefficient.

FNNN - a floating point representation of the rank of the matrix of
the normal equations.

SDEVSQ - the residual variance.

SSE - the residual, or error, sum of squares.

SSR - the unadjusted regression sum of squares.

(4) Variables in Subroutine CHISQ

F? - the actual number of prediction errors in a group of intervals
in the search for a group of intervals having more than 5
expected prediction errors.

F0C - the computed (expected) number of prediction errors in a group
of intervals in the search for a group of intervals having
more than 5 expected prediction errors.

F0MRE - the program looks ahead each time it finds a group of intervals
having more than 5 expected prediction errors to determine
whether or not more than 5 expected prediction errors remain;
if not, then the remaining frequencies are associated with
the preceding group and F0MRE is the resulting difference
between the observed frequency and the expected frequency.

F0T - the total number of (observed) prediction errors which have
contributed to the Chi-square statistic.

JJ - the interval index of the interval which was the last of a
group of intervals containing more than 5 expected prediction
errors.

K0UNT - a variable which counts the number of groups of intervals
having more than 5 expected prediction errors.

PR0BN - the area under the normal frequency function from -∞ to the
upper bound of any of the various intervals into which the
range of prediction errors is divided.




NWL  REPORT  NO.  2035 


PR0B0 - the area under the normal frequency function from -∞ to the
upper bound of the last interval which was the last of a
group of intervals containing more than 5 expected prediction
errors.

REMAIN  -  the  remaining  number  of  expected  prediction  errors. 
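
The grouping rule implied by the CHISQ variables above can be sketched as follows. This is a minimal Python illustration, not a transcription of subroutine CHISQ: the function names, the argument names, the use of `math.erf` for the normal distribution function, and the treatment of the first and last intervals as open-ended are all assumptions made for the example. What is taken from the definitions is the rule itself: a group of intervals is closed once it holds more than 5 expected prediction errors, and if 5 or fewer expected errors would remain, the remainder is folded into that group.

```python
import math


def normal_cdf(t):
    """Area under the standard normal frequency function from -infinity to t."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def grouped_chi_square(upper_bounds, observed, sigma, min_expected=5.0):
    """Chi-square statistic with intervals pooled into groups.

    upper_bounds: ascending upper bounds of the intervals (role of ESTEP)
    observed:     observed prediction-error counts per interval (role of IFREQ)
    sigma:        assumed standard deviation of the prediction errors
    Returns (chi_square, number_of_groups).
    """
    n = float(sum(observed))
    cdf = [normal_cdf(b / sigma) for b in upper_bounds]
    cdf[-1] = 1.0  # assumption: the last interval is open-ended
    expected = [n * cdf[0]]
    expected += [n * (cdf[j] - cdf[j - 1]) for j in range(1, len(cdf))]

    chi_square, groups = 0.0, 0
    obs_sum, exp_sum = 0.0, 0.0
    remain = n  # expected errors not yet assigned to a group (role of REMAIN)
    for j in range(len(expected)):
        obs_sum += observed[j]
        exp_sum += expected[j]
        remain -= expected[j]
        if exp_sum > min_expected:
            if remain <= min_expected:
                # Look-ahead: too little is left for another group,
                # so the tail is associated with the current group.
                obs_sum += sum(observed[j + 1:])
                exp_sum += sum(expected[j + 1:])
                remain = 0.0
            chi_square += (obs_sum - exp_sum) ** 2 / exp_sum
            groups += 1
            if remain == 0.0:
                break
            obs_sum, exp_sum = 0.0, 0.0
    return chi_square, groups
```

With seven unit-width intervals and 100 prediction errors, for example, the two sparse outer intervals are absorbed into their neighbors and five groups contribute to the statistic.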

(5)  Variables  in  Subroutine  CMPR 

AN - a floating point representation of the total number of data
points.

ESQU0T - the error sum of squares.

FQU0T - the F value for regression in the analysis of variance tables.

IRCT - the number of independent variables in the present model,
plus 1.

IW - an index for elements in arrays which elements are used to
define the variable output formats.

K - the number of independent variables in the present model,
plus 2.

LAST - a variable used in the computation of index values for arrays
which are used to complete the definition of the variable
formats for the printing of the regression equation.

LL - a variable used to control the printing of page headings.

N0MR - an integer representation of the error degrees of freedom.

0MR - the error degrees of freedom.

R - a floating point representation of the number of independent
variables in the model; the degrees of freedom for regression.

RSQU0T  -  the  mean  square  for  regression. 
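
The relation between the CMPR quantities can be illustrated with a short sketch. It is written in Python for illustration only and is not taken from the program listing; the function and argument names are hypothetical, and it assumes that the regression sum of squares is the one adjusted for the mean. The error degrees of freedom follow the definition given earlier for NR1: the number of data points minus the rank NNN = N + 1.

```python
def anova_for_regression(ssr_adjusted, sse, n_independent, n_points):
    """Return (mean square for regression, error mean square, F value).

    ssr_adjusted:  regression sum of squares adjusted for the mean (assumed)
    sse:           error sum of squares
    n_independent: number of independent variables in the model (role of R)
    n_points:      total number of data points (role of AN)
    """
    r = float(n_independent)                      # degrees of freedom for regression
    omr = float(n_points - (n_independent + 1))   # error degrees of freedom
    if r <= 0.0 or omr <= 0.0:
        raise ValueError("need at least one variable and positive error d.f.")
    mean_square_regression = ssr_adjusted / r
    error_mean_square = sse / omr
    f_value = mean_square_regression / error_mean_square
    return mean_square_regression, error_mean_square, f_value
```

For 13 data points and 2 independent variables the error degrees of freedom are 10, so a regression sum of squares of 80 and an error sum of squares of 20 give mean squares of 40 and 2.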

(6)  Variables  in  Subroutine  FIX 

LIT - an array which contains a BCD representation of the first
NNNSAV (see Section VIII.1.a) elements of the L0T array and
BCD zeroes for the remaining elements.




(7) Variables in Subroutine GAUSS

AMAX - the maximum element, of those elements searched, in the
matrix of the normal equations at each step of the inversion
process.

IC0LUM - the column number of the maximum of those elements in
unpivoted rows.

IG0 - a variable which indicates when no pivot element could be
found at a step of the inversion process.

INDEX - an array containing the row and column numbers of those
elements which are used as pivot elements.

IPIV0T - an array which indicates those rows of A which have served
as pivot rows.

IR0W - the row number of the pivot element.

PIV0T - a variable set equal to the value of the pivot element.

SWAP - a temporary storage location used to interchange rows and
columns.

T - a variable which is equal to the successive elements of the
A matrix which are in the same column as the pivot element.
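
The roles of these variables match the familiar Gauss-Jordan inversion with a search for the largest element among unpivoted rows and columns. The sketch below shows that generic scheme in Python; it is an illustration under that assumption, not a transcription of subroutine GAUSS, and the function name, the `tiny` threshold, and the return convention are hypothetical.

```python
def gauss_jordan_inverse(a, tiny=0.0):
    """Invert a square matrix (list of lists) by Gauss-Jordan elimination.

    Returns (inverse, determinant), or None when no usable pivot is found.
    """
    n = len(a)
    a = [row[:] for row in a]      # work on a copy
    pivoted = [False] * n          # role of IPIVOT
    index = []                     # role of INDEX: (row, column) of each pivot
    det = 1.0
    for _ in range(n):
        # Search the unpivoted rows and columns for the largest element.
        amax, irow, icol = 0.0, -1, -1
        for j in range(n):
            if pivoted[j]:
                continue
            for k in range(n):
                if not pivoted[k] and abs(a[j][k]) > amax:
                    amax, irow, icol = abs(a[j][k]), j, k
        if amax <= tiny:
            return None            # no pivot element could be found (role of IGO)
        pivoted[icol] = True
        if irow != icol:           # bring the pivot onto the diagonal
            a[irow], a[icol] = a[icol], a[irow]
            det = -det
        index.append((irow, icol))
        pivot = a[icol][icol]
        det *= pivot
        a[icol][icol] = 1.0
        a[icol] = [v / pivot for v in a[icol]]
        for l in range(n):
            if l != icol:
                t = a[l][icol]     # element in the pivot column (role of T)
                a[l][icol] = 0.0
                a[l] = [a[l][m] - a[icol][m] * t for m in range(n)]
    # Undo the row interchanges by swapping columns in reverse order.
    for irow, icol in reversed(index):
        if irow != icol:
            for k in range(n):
                a[k][irow], a[k][icol] = a[k][icol], a[k][irow]
    return a, det
```

Returning `None` instead of a flag is a convenience of the sketch; in the program the corresponding condition is signalled through IG0.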

(8) Variables in Subroutine IDENTM

AIDENT - the identity matrix.

SUM - a variable which is used to compute the individual elements
of the identity matrix.
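
An identity matrix check of this kind can be sketched as below: the product of the original matrix and its computed inverse is formed element by element and compared with the identity. The Python function and its single `tolerance` argument are assumptions for illustration; the program itself reads two tolerances (TOLI1 and TOLI2), whose separate roles are not reproduced here.

```python
def identity_check(a, a_inv, tolerance):
    """Return (accepted, worst_deviation) for the product a * a_inv."""
    n = len(a)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            # One element of the computed identity matrix (role of SUM).
            s = sum(a[i][k] * a_inv[k][j] for k in range(n))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(s - target))
    return worst <= tolerance, worst
```

An inverse is accepted in this sketch only when every element of the product deviates from the identity by no more than the tolerance.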

(9) Variables in Subroutine IV0R

AMAX - the maximum ASSR value.

AMIN - the minimum ASSR value.

IG02 - a variable which indicates the case of a perfect fit.

IMAX - the L0T array index of that independent variable which is to
be included in the model.

ISTART - a variable which is used to define the L0T array index of
the leftmost independent variable of a group of independent
variables.




IXMAX - the index of the maximum ASSR value in the AW array.

IXMIN - the index of the minimum ASSR value in the AW array.

KASSR - a counter of the ASSR values which are computed at each step.

KG0 - a variable which indicates that a non-valid ASSR value was
computed by the CASSR subroutine.

K0UNT - a counter of the number of independent variables which have
been "actively" ordered, i.e., ordered in a group of IV's as
long as there is more than one IV left in the group.

LAT - an array which holds the L0T array indices of the independent
variables for which ASSR values are computed.

NUM - a variable which is used to determine when to cease "actively"
ordering independent variables in a specified group of
independent variables, i.e., when there is only one independent
variable left in the group.

T0LSS - a tolerance which is used to establish equality of ASSR
values and hence the perfect fit.
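
The ordering subroutines rank independent variables by their ASSR values and record the positions of the largest and smallest value. As a rough illustration, the Python sketch below computes, for each variable in a fitted model, the textbook sum of squares due to one coefficient adjusted for the others, b_j^2 / c_jj, from the regression coefficient b_j and the corresponding diagonal element c_jj of the inverse matrix. Whether CASSR forms its ASSR values in exactly this way is an assumption, and the function names are hypothetical.

```python
def adjusted_sums_of_squares(coefficients, inverse_diagonal):
    """Sum of squares due to each coefficient, adjusted for the others."""
    return [b * b / c for b, c in zip(coefficients, inverse_diagonal)]


def extreme_indices(values):
    """Return (index of maximum, index of minimum), like IXMAX and IXMIN."""
    ixmax = max(range(len(values)), key=values.__getitem__)
    ixmin = min(range(len(values)), key=values.__getitem__)
    return ixmax, ixmin
```

In a backward ordering the variable with the smallest such value would be the natural candidate for deletion, and in a forward ordering the one with the largest value the natural candidate for inclusion; ties are resolved here by taking the first occurrence.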

(10)  Variables  in  Subroutine  MAXMIN 


The variables used by this subroutine have been
amply defined by any one of its calling subroutines, and, therefore,
these variables will not be further defined here.


(11)  Variables  in  Subroutine  PREVAR 

JJJ - an index which is used to delete independent variables from
the X array.

TWIXX - equals

$$\sum_{v=1}^{N} \sum_{v'=1}^{N} c_{vv'} \, (x_{vt} - \bar{x}_{v}) (x_{v't} - \bar{x}_{v'})$$

which is used in the computation of prediction standard
deviations.
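
A double sum of this kind, taken over deviations of a design point from the variable means and weighted by elements of an inverse matrix, is the usual ingredient of a prediction standard deviation. The Python sketch below evaluates such a quadratic form and then applies the standard textbook formulas for the prediction line and for an individual observation. The function names are hypothetical, and the use of these particular textbook formulas is an assumption; the exact expressions used by subroutine PREVAR are not restated in this section.

```python
import math


def centered_quadratic_form(c, x, xbar):
    """Sum over v, w of c[v][w] * (x[v] - xbar[v]) * (x[w] - xbar[w])."""
    d = [xi - mi for xi, mi in zip(x, xbar)]
    n = len(d)
    return sum(c[v][w] * d[v] * d[w] for v in range(n) for w in range(n))


def prediction_sd(residual_variance, n_points, q, individual):
    """Textbook prediction standard deviation (assumed form).

    q is the centered quadratic form; 'individual' adds the variance of a
    single new observation to that of the prediction line.
    """
    factor = 1.0 / n_points + q
    if individual:
        factor += 1.0
    return math.sqrt(residual_variance * factor)
```

At the point of means the quadratic form vanishes, so the standard deviation of the prediction line reduces to the square root of the residual variance divided by the number of data points.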

(12) Variables in Subroutine PRtKIH

AIDENT - the identity matrix.

(13) Variables in Subroutine RDISK

ISTART - a variable which is always 1 more than the number of records
read from tape or disk logical unit 10.
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IWHICH  -  the  number  (record  number)  of  a  data  point  which  is  to  be 

used  as  a  selected  input  design  point  in  prediction  standard 
deviation  calculations. 

NUMBER  -  the  number  of  records  which  must  be  read  in  order  to  position 
the  storage  device  so  that  the  IWHICHth  data  point  can  be 
read  with  the  next  READ  statement. 

SKIP  -  a  variable  which  is  used  to  skip  records. 
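
The positioning arithmetic implied by ISTART, IWHICH, and NUMBER can be written out in a few lines. The Python sketch below is illustrative only; the function name and the error raised when the requested record has already been passed are assumptions, not features taken from the listing.

```python
def records_to_skip(iwhich, istart):
    """Number of records to read and discard before record IWHICH.

    istart is 1 more than the number of records already read, i.e. the
    number of the record that the next READ would deliver.
    """
    number = iwhich - istart
    if number < 0:
        # Assumption: a sequential device cannot move backwards without a rewind.
        raise ValueError("record already passed; rewind before reading")
    return number
```

If two records have been read, ISTART is 3; reaching the 7th data point then requires skipping records 3 through 6, that is, 4 records, after which ISTART would become 8.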

(14)  Variables  in  Subroutine  RDIT 

INDEX  -  the  number  of  an  independent  variable  which  is  to  be  used  as 
a  factor  in  a  product  term,  plus  1. 

J1  -  the  Y  array  index  of  the  last  variable  on  each  card  of  input. 

KK  -  the  Y  array  index  of  a  product  term. 

MZ - if a data point requires more than one card or record to
contain the coordinates of the OCIV's, then MZ is used as
a dummy variable in reading those cards or records after
the first card or record.

Y - an array which contains the coordinates of the dependent
variable and those of the OCIV's as they are read and which
later contains also the coordinates of the GCIV's.

(15)  Variables  in  Subroutine  REDUCM 

These variables are described in Section VIII.1.a
and, therefore, will not be further described here.




VIII.2 Flow Charts

a. DA-MRCA SUBROUTINE FLOW CHART

b. MAIN PROGRAM


(Flow chart of the main program; box text in order of appearance.)

START

REWIND TAPES.

SET CLOCK. SET IV IDENTIFICATION TO "HAND". PRINT PROGRAM
IDENTIFICATION.

PROGRAM ENDS IF AN EOF CARD IS PRESENT INSTEAD OF THE PROBLEM
IDENTIFICATION CARD.

READ PROBLEM IDENTIFICATION CARD. READ PROBLEM CONTROL CARD.
COMPUTE TOTAL NUMBER OF INDEPENDENT VARIABLES = N.

SET NPED TO 1.

TOO MANY OR TOO FEW IV'S? (YES: PRINT "CARD TYPE 2 IS INCORRECT".)

IF NFD = 0, MAKE UP PROPER FORMAT FOR READING DATA INPUT CARDS.

IBIDS = 1. ADD 1 TO THE INPUT VALUES OF NR, IBID AND IVORGO.
ISKIP = 1.

RSSMO = -3.0

IS = 0? (NO: READ PRODUCT TERM DESCRIPTION CARDS. ADD 1 TO EACH
VALUE READ.)

USING IVORGO AS A BASIS FOR ACTION, READ THE IVOR AND/OR BIVOR
CONTROL CARDS IF PRESENT.

IF PREDICTED VALUES AND STANDARD DEVIATIONS ARE DESIRED FOR SELECTED
INPUT DESIGN POINTS, READ SELECTED INPUT DESIGN POINT CARD(S);
OTHERWISE CONTINUE.

IF PREDICTED VALUES AND STANDARD DEVIATIONS ARE DESIRED FOR SYNTHETIC
DESIGN POINTS, REWIND UNIT 11, CALL RDIT TO READ SYNTHETIC DESIGN
POINT CARDS AND TO COMPUTE PRODUCT TERMS IF NEEDED. STORE THE
RESULTING SYNTHETIC DESIGN POINTS ON DISK UNIT 11; OTHERWISE CONTINUE.

REWIND DISK UNIT 10.

CALL RDIT TO READ DATA INPUT CARDS (OR DATA TERMINATION CARD) AND TO
COMPUTE PRODUCT TERMS. RDIT ALSO COUNTS THE NUMBER OF DATA POINTS.

WAS THE LAST CARD A DATA TERMINATION CARD?

TOO FEW OR TOO MANY DATA POINTS? (YES: PRINT "TOO FEW OR TOO MANY
DATA POINTS".)

STORE THIS DATA POINT ON DISK 10.

PRINT (OR NOT PRINT) THIS DATA POINT AS INDICATED BY THE INPUT
VALUE NDPQ.

INCLUDE THIS DATA POINT INTO THE SUMMATION MATRIX S.

STORE 1st (N+1) ROWS AND COLUMNS OF THE SUMMATION MATRIX INTO A
AND AKP.

PRINT THE SUMMATION MATRIX.

STORE THE 1st (N+1) ELEMENTS OF THE (N+2)th COLUMN OF THE SUMMATION
MATRIX INTO ARRAYS B AND BB.

NR = NR - 1

IF THIS IS THE MAIN RUN AND BIVOR WILL BE CALLED AND IF IBID = 2,
THEN SET IBIDS = 3 SO THAT SUBROUTINE ABT CAN GIVE THE PRINTOUT.
NO IDENTITY MATRIX CHECKS WILL BE MADE ON SUBSEQUENT BIVOR

CALL SUBROUTINE ABT.

IF IBIDS = 3, THEN SET IBIDS = 1.

WAS AN ERROR FOUND BY SUBROUTINE ABT?

IS THIS A MAIN RUN?

COMPUTE AND PRINT AVERAGES OF THE INDEPENDENT VARIABLES AND
DEPENDENT VARIABLE.

WAS ANY ELEMENT OF THE MAIN DIAGONAL OF A⁻¹A > T0LI2?

RETURN
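
The step "INCLUDE THIS DATA POINT INTO THE SUMMATION MATRIX S" and the later computation of averages can be sketched as follows. The Python code mirrors what the program listing appears to do (an upper-triangular accumulation of sums and cross-products, with the count of data points stored in the leading element), but it is an illustrative reading of an imperfectly reproduced listing, and the function names and zero-based indexing are assumptions.

```python
def accumulate_summation_matrix(points):
    """Build the upper-triangular summation matrix from data points.

    points: list of rows [x1, ..., xN, y]; a constant term is prepended
    internally so that row 0 holds the plain sums.
    """
    width = len(points[0]) + 1
    s = [[0.0] * width for _ in range(width)]
    for row in points:
        x = [1.0] + list(row)
        for i in range(1, width):
            s[0][i] += x[i]                 # sums of each variable
            for j in range(i, width):
                s[i][j] += x[i] * x[j]      # sums of squares and cross-products
    s[0][0] = float(len(points))            # number of data points
    return s


def averages(s):
    """Average of each variable: its sum divided by the number of points."""
    return [s[0][i] / s[0][0] for i in range(1, len(s))]
```

Only the upper triangle is accumulated; a symmetric working matrix can be filled from it afterwards, which is what the listing's "FORM SYMMETRICAL NORMAL MATRIX A" step suggests.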


<$> 
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c.  SUBROUTINE  AST 
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d.  SUBROUTINE  IVOR 
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e.  SUBROUTINE  BIV0R 


(Flow chart of subroutine BIVOR; box text in order of appearance.)

START

NOBS = 1

SAVE THE INPUT VALUE OF NPED INTO NTAPE AND SET NPED = 1 SO THAT
CHI-SQUARE AND PREDICTED VALUES WILL ALWAYS BE DONE FOR THE FIRST
ACCEPTED RUN OF BIVOR.

SET KNUM = -1 TO INDICATE TO THE CASSR SUBROUTINE THAT CASSR IS BEING
CALLED BY THE BIVOR SUBROUTINE.

SET M4 TO ZERO. ASSIGN A STATEMENT NUMBER TO "ISEE" WHICH WILL CAUSE
BIVOR TO CONTINUE SEARCHING FOR AN ACCEPTED INVERSION.

WAS THE MAIN RUN ACCEPTED?

PRINT HEADING FOR THE BIVOR FINAL COMPREHENSIVE ON BCD TAPE 13.

SET THE ENTIRE LOT ARRAY EQUAL TO ZERO.

COMPUTE THE TOTAL NUMBER OF INDEPENDENT VARIABLES WHICH ARE TO BE
ORDERED ACCORDING TO THE INPUT VALUES OF THE FIRST MB VALUES OF THE
NO ARRAY.

DELETE ANY REMAINING VARIABLES FROM THE MODEL BY SETTING THE
CORRESPONDING VALUES OF THE LOT ARRAY = 1.

I = 0

I = I + 1

DOES K = THE NUMBER OF INDEPENDENT VARIABLES IN THE Ith GROUP?

I = MB?

RETURN



VIII.3 Programming and Conversion Notes

a. Language - DA-MRCA is coded for the IBM 7030 computer
(STRETCH) entirely in FORTRAN IV. FORMAT and DATA statements assume
eight characters per word.

b. Input-Output Requirements - Three BCD tapes are required
in addition to the system printer output tape. These BCD tapes have
logical unit numbers of 5, 9, and 13, where 5 is the number for the
tape unit containing the coordinates of the OCIV's and of the dependent
variable when this data is on a separate tape; 9 is the number for the
tape unit containing the analysis of variance tables which are computed
in the program; and 13 is the number for the tape unit which contains
the final comprehensive analysis table.

Two  disk  (or  binary  tape)  logical  units  are  required. 

Disk  logical  unit  10  is  used  to  store  the  coordinates  of  the  data 
points,  and  disk  logical  unit  11  is  used  to  store  the  coordinates 
of  the  OCIV's  for  the  synthetic  design  points. 

The  input-output  requirements  are  described  for  the 
STRETCH  in  the  I0D  subprogram.  The  program  listing  contains  a  listing 
of  this  subprogram. 

c. Storage Requirements - C0MM0N storage requires 25461
locations. The subprograms, excluding library functions and subroutines,
require 4511 locations on the STRETCH but may require more or less on
other machines.

d. Library Subroutines and Built-in Functions -

ABS - the absolute value function.

E0F - returns a value of .TRUE. if an end of file has been read,
.FALSE. otherwise.

FL0AT  -  converts  an  integer  to  a  floating  point  number. 

FREQ(T) - the normal distribution function which gives

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{T} \exp[-y^{2}/2] \, dy.$$

INTVL - measures the interval, in seconds, between the current entry
into INTVL and the exit from the immediately preceding TIME,
INTVL, or SETIT subroutine.

KL0K - the time in hours/minutes/seconds since the last CALL SETCLK.
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MIN0  -  chooses  the  smallest  of  its  fixed  point  arguments. 

SETCLK  -  used  at  the  beginning  of  a  portion  of  a  program  to  be  timed 
by  the  KL0K  subroutine. 

SETE0F - this function is necessary in order to use E0F; it causes
E0F to be set to .TRUE. and termination of execution of the
READ statement when an end of file has been reached.

SETIT - see INTVL and TIME.

TIME  -  measures  the  usable  elapsed  time,  in  seconds,  between  the 
exit  from  SETIT  and  the  current  entry  into  TIME. 
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VIII. 4  Program  Listing 


SUB  TYPE  *  F I 00 
2 ! 00  «  »RE AOCR 

3100. SPRINTER 

SO  1 00 • T APE  «  «  «  . EVEN «  «  SA V£ 

REEL* PUL 

90  1 00* TAPE*  «  «  «ECC * • 

REEL  *NULS9 
JOIOO.OISK...TO* 

II 100. 01 SK.** 100 
130  1 00 • T  APE  *  *  *  *  ECC  «  « 

REEL  *NULS1 3 
ENO 

subtvpe»fortran*lmap»pbin  nrcaoooo 

DA-MRCA  MULTIPLE  REGRESSION  COMPREHENSI VE  ANALYSIS  MRCAOOIO 

COMMON  A(3t.Sl  I.BS0EVC31  }  .6 ( 260 1 > » YY (7000 1 «* (32 ) *X0 (51  >  MPCA0020 

CONMON  AW(32).YSOEV|7000).AW(3l  >  «RECM«  NOR*MVPL  *NNNSAV«NNN.L0T  (31  )MRCA0030 
COMMON  NNL . 0ETERM.N0B5.T0L  I  I .T0LI2  *ERROR*NPED. ITOTAL*N»NOPO» ICA5EMRCA00*0 


COMMON  RSSMO* ISKIP*NJ(2S)*M4«P|RM(7)«KNUM*KMUM*MB«MI .NO (25 ) • 10 
COMMON  NNXA.NNSAV.SDEV.AKPI  SI .St  » .68 <32  J . S <52 .82 ) .PGL3 ( t 0  I 
COMMON  IN<«9.10).IR.IS.M1 .JLIM.NN.M.  TAPE 
COMMON  SELECT. I8I0.IBI0S 

01 MENS  I  ON  EYY(7000t.LET(tS) * IKEEPR (999 > .FORM (3 ) 

EQUIVALENCE (B ( 1602 ) . IKEEPR ( 1 ) t • I YSOEV.EYV > « { I  CASE. I  RUN) 
EQUIVALENCE (LOT (1  I.LETU  )).(FORM(|  )«FlRM(2>) 

EOUI VALENCE (NNXA.NEN.L1M ) 

INTEGER  TAPE 
LOGICAL  EOF 

DATA  LIMOB(7000).F|RMF(BH(I2.  | .SEVEN(8M7F|0.4)  I  *  CPAREN ( l H I ) 
OATA  HANOSISH (HANOI  ). IVORS C«M( IVOR)  ) .61 VORS (8H <B I VOR )  ) 


CALL  SETEOF  MRCA0230 

REWIND  S  MRCA02A0 

HEW  I NO  9  MRCA02S0 

REWINO  13  MRCA0260 

F|RMII)*F|RMF  MRCA0270 

540 1  CALL  setclk  mrcaoebo 

CALL  SET  IT  MRCA0290 

SElEC  t .hands  MRCAOJOO 

M4a  0  MRCA03I0 

PRINT  2004  MRCA03E0 

2064  FORMAT (4PH2DA*MRC A  ...  OUTPUT  FROM  PROGRAM  VERSION  (/  1/64)  MRCA0330 

RE AO  S9S.PGLB  MRCA0340 

3 95  FORMAT ( I OAB I  MRCA0350 

PRINT  394.PGLB  MRCA0360 

RE AO  S79.|R.IS.NR.MVP.N0R.MVPL«NPE.N0PO.rAPEtlV0R6O*A'0«l8IO.T0LtlMRCA0370 

I. T0LI2.F0RM  MRCAO30O 

379  FORMAT (212*313*911*12*11 • IX*2£9*3*3A8 )  MRCA0390 

PRINT  973  MRCA0400 

973  FORMAT! | IBMOIR  IS  NR  MVP  NOR  MVPL  NPE  NOPO  TAPE  IVOR GO  NFD  (BIO  MRCA04I0 

I  TOLII  T0LI2  FORM  -  INPUT  OATA  DESCRIPTION  -CARO  TYPE  2)  MRCAO*20 

POINT  976. !0.!S.NR,NVP.N0R«MVPL.NPE«N0P0. TAPE* IVOROO.NFO.ieiO.TOLtMRCAOAJO 

II . T0LI2.F0RM  MRCA0440 

976  FORMAT (I HO. It* IX* 12* 3( IX* 13) *2 (3X* 1 1 )•* I4X* 1 1) «SX. 1 1 *4X* I t*3X* 1 1 • |MRCA0«30 

IX.2llX.E9. 3 ) • tX.SAB )  MRCA0«60 

Na I R* | S  MRCA0470 


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NPE0-1 

      NNN=N+1
      ITOTAL=NNN
      IF((N-1)*(N-50))1107,1107,107
 1107 NSAV=N
      IF(NFD)2999,2999,3000
 2999 FIRM(2)=SEVEN
      KNUM=7
      KMUM=6
      GO TO 3001
 3000 FIRM(7)=CPAREN
      KNUM=NFD
      KMUM=NFD-1
 3001 ICASE=0
      NR=NR+1
      INDX=IR+1
      IBIDS=1
      IBID=IBID+1
      IVORGO=IVORGO+1
      IF(TAPE)8,10,8
   10 ITAPE=2
      GO TO 11
    8 ITAPE=5
   11 ISKIP=1
      TAPE=2
      RSSMO=-3.0
      NNNSAV=0
      IF(IS)13,13,12
   13 JLIM=IR+1
      LIM=JLIM
      NN=LIM+1
      GO TO 14

12  BE*0  92. « (ISIK.L ) .L»l ♦ 10 ) 1 » IS) 

92  F OPM A T 14  012) 

PRINT  29. ( I |NIK»L)*L*»i .10) .K* 1 « IS) 

29  FORMAT  I 39hC»R00UCT  TERR  DESCRIPTIONS 
I I0I3.1H/. 1013.1 H/.I0I3. IH/ ) ) 

JL|M« | R  4 1 
UR«JLIM4IS 

NN>LIM4| 

DO  20  *»I«IS 
00  20  t»l ♦ 10 

20  INtK.L )■ IN1K.L )4| 

14  GO  T0<2 I .22.23*22 ) . I VORGO 
C  BEAD  IVOR  GROUPING  VALUES 

22  RE AO  100.I0»M|.(NJ(| >.!•!. MJ) 

1 00  FORMAT) 12*2613) 

PRINT  I , IO.M| . <NJf l)*l>l«MI) 

1  FORMAT (4IH0IO  M|  MJ( I i. |«| .2.....MI 
1*23(3) 

GO  TO <22 1 *21 *221 *23) •  I VORGO 
C  RE AO  0 IVOR  GROUPING  VALUES 

23  RE AO  I00.M8.IL0TII |.t«l*M8) 

00  OR  l>l *M8 

MMMsMO- I 4 1 
99  NO ( I I-lOT(MMM) 

PRINT  1 0 1 . M8. (LOT  1 1 ) « I *1 *M8 ) 

101  FORMAT  (4IM0M8  LOT <!> *  I • I *2* • • • * M8 

21  |F (NOR ) 26 «26**0 

40  READ  4 1  *  < IKEEPRC I ) • !•! *NOR) 



CARO  TYPE  3/IM  ✓  <  IX. 1013* IH/MRCA0830 

MRCA0840 
MRC.A08S0 
MBC A 0860 
MRCA0870 
MRCA08B0 
M/ACA0890 
FRCA0900 
HRCAO9I0 
MRCA0920 
MRCA0930 
MRCA0940 
MRCA0950 

-CARO  TYPE  4/lV. I2»2X. I2.2KMRCA0960 

MRCA0970 

MRCA0980 

MRCA0990 

MRCAJOOO 

MRCA10I0 

MRCAI020 

MRCAI030 

MRCA1040 

•CARO  TYPE  9/1 X* I2»2K«2SI3 )MRCA1050 

MRCA I860 
MRCA1070 
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c 


41 

FORMAT (20 

14) 

PRINT  411 

. I IKEEPR ( 

411 

FORMAT 153H0NUMBERS 

1015) I 

26 

IF ( MVP )25 

.25.466 

466 

REWIND  11 

DO  79  K  *  1 

.MVP 

READ  I N  POINTS  FO R  VAR 
CALL  roit 

T9  WRITE!  U » IXII >• 1*2 
,75  PRINT  594.PGLB 
S94  FORMAT ( 1  HI • tOAS) 


MRCAl 080 

I  )  « I  *  1 «  NOR )  MRCAl 090 

OF  SELECTED  INPUT  DESIGN  POINTS  -CARD  TYPE  6/<2MRCAll00 

MRCAIIIO 
MRCAl 120 
MRCAl 130 
MRCAl 140 

IANCE  OF  PREDICT  I  ON. COMPUTE  PRODUCT  TERMS.  MRCAl 150 

MRCAl 160 

»  LIM)  MRC A  1170 

MRCAl 180 
MRCAl I  90 


      NNSAV=NN
      NNL=NN
      DO 2 I=1,NN
      DO 2 J=1,NN
C     INITIALIZE SUMS TO ZERO
    2 S(I,J)=0.0
C     READ INPUT DATA
      REWIND 10
      TAPE=ITAPE
      M=0
    5 CALL RDIT
      IF(M1)31,55,31
   55 WRITE(10)(X(I),I=2,NN)
      IF(NOPO)5008,5006,5008
 5006 PRINT 5506,M,(X(I),I=2,NN)
 5506 FORMAT(1H ,I4,2X,9F13.6/(7X,9F13.6))
      GO TO 7
 5008 IF(NOPO-1)7,5007,7
 5007 PRINT 6,M,(X(I),I=2,NN)
    6 FORMAT(1H ,I4,2X,7E17.8/(7X,7E17.8))
    7 DO 3 I=2,NN
C     GENERATE TRIANGULAR SUMMATION MATRIX
      S(1,I)=S(1,I)+X(I)
      DO 3 J=I,NN
    3 S(I,J)=S(I,J)+X(I)*X(J)
      GO TO 5
   31 PRINT 594,PGLB
      IF((M-2)*(LIMOB-M))2097,2095,2095
 2097 PRINT 2096
 2096 FORMAT(34H1TOO FEW OR TOO MANY DATA POINTS  )
      GO TO 5400
 2095 PRINT 15
   15 FORMAT(17H0SUMMATION MATRIX)
      S(1,1)=M
      RECM=1.0/S(1,1)
C     FORM SYMMETRICAL NORMAL MATRIX A
      DO 9 I=1,NNN
      LOT(I)=0
      DO 9 J=I,NNN
      A(I,J)=S(I,J)
      A(J,I)=S(I,J)
C     FORM MATRIX WHICH SAVES A
      AKP(I,J)=S(I,J)
      AKP(J,I)=S(I,J)
    9 CONTINUE
      DO 33 I=1,NNN
   33 PRINT 16,(A(I,J),J=1,NNN),S(I,NN)
      PRINT 16,(S(J,NN),J=1,NN)
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16  FORMAT! 1  HO ♦ 7E 1 7. 8/ ( I  X . 7E 1 7.8  )  > 

DO  679  1=1 «NNN 
B(  !>=S( J.NN) 

C  SAVE  CONSTANTS 
679  BB ( I ) =8 ( I ) 

C  INVERT  A  AND  SOLVE  NORMAL  EQUATIONS 
96  NR=NR-1 

I F ( NNNSAV >500.501 .500 
BO  1  1 F ( ( I VORGO-3  >  * ( I VORGO-4 ) i 500 . 502 , 500 
•502  IF  (  ISI0.EQ.2  1  IB  IDS" 3 
500  KOUNT  =  0 

call  abt 

I F ( IBIDS»E0.3  > IB I  DS  =  1 
2050  I F (ERROR  >  221.5660.833 
833  IF (NNNSAV12008. 2008.83 
2008  I  SK  IP  =  2 

GO  TO  660 

C  1 S< I P*2  MEANS  NO  FINAL  COMPREHENSIVE 
5660  IFINNNSAVJ660.660.39 
660  DO  60  1=2. NN 

C  COMPUTE  THE  AVERAGES  OF  EACH  VARIABLE 

60  AVV< I ) =S ( 1 . I >/S(  1 . 1 ) 

IF(NOBS.EQ.A) ISKJP*2 
NPED=NPE 

PRINT  61  .  (  I .AVV( 1  +  1) . 1*1 ,NNN) 

61  FORMAT (57HC AVERAGES  OF  INDEPENDENT  VARIABLES  AND  DEPENDENT 

1E/I6I I3.E17.8 ) ) ) 

I F  < ERROR ) 39 .39. 83 
39  IF (NOR+MVP >83.83. 62 

62  PRINT  594.PGLB 

I F ( NNNSAV  >61 1  .611  .2023 
2023  PRINT  5760. SELECT . (LOT ( I ) . I ■ 1 .NNNSAV ) 

5760  FORMAT (32H0 INDEPENDENT  VARIABLE  SELECTION  .AS. IX. 51  ID 
611  IFINDRJ63. 63.44 

44  PRINT  64 

f.4  FORMAT  (32H0SELECTED  INPUT  DESIGN  POINTS...) 

C  RCISK  READS  DATA  FROM  DIS*(OR  BINARY  T APE )FOR  USE  AS  SELECTED  DATA 
C  INPUT  OBSERVATIONS  AND  CALLS  PREVAR  TO  COMPUTE  PREDICTED  VALUES  AND 
C  PREDICTION  STANDARD  DEVIATIONS. 

CALL  RD I  SKI KOUNT. INOX) 

63  IF<MVP>7202.7202.46 
46  REWIND  l l 

45  PRINT  43 

43  FORMAT (27H0SYNTHET |C  DESIGN  POINTS.. • I 
DO  80  K« I .MVP 

C  READ  IN  POINTS  FOR  STANDARO  DEVIATION  OF  PREO I CT ION. COMPUTE  AND 
READ  Ill  )  ( X  1 1 ) • I *2  «L 1 M ) 

CALL  PREVAR (KOUNT. INOX) 

IF (KOUNT ) S3. 83. 7201 
IF (MVPL >3022.2022.3022 
PRINT  82 « (K . YY (K ) » YSOEV (K )  »K«  1 .KOUNT) 


80 
7202 
7201 
202  2 
82 


MRCA1680 
MRCA1690 
MRC A  1 700 
MRCA1710 
MRCA1720 
MRCA1 730 
MRCA1 740 
MRCA1750 
MRCA1 760 
MRCAI770 
MRCA1780 
MRCA1 790 
MRCA1800 
MRCA181 0 
MRCA1820 
MRCA1B30 
MRCA1840 
MRCA1850 
MRCA 1 860 
MRCA1870 
MRCA 1 880 
MRCA I  890 
MRCA 1900 
MRC  A  1 9 i 0 
MRCA 1920 
VARIABLMRCA 1 930 
MRCA 1 940 
MRCA195C 
MRCA 1960 
MRCA 1970 
MRC A 1 980 
MRC A 1990 
MRCA2000 
MRCA2010 
MRCA2020 
MRCA2030 
MRCA2040 
MRCA2050 
MRCA2060 
MRCA2070 
MRCA20BC 
MRCA2090 
MRC A2 100 
MRCA21 10 
MMCA2120 
WRITE. MRCA2130 
MRCA2I40 
MRCA2I50 
MRCA2I60 
MRCA2I70 
MRCA218C 


STANDARD 


FORMA T(90MO ITEM  NUMBER .PREDICTED  VALUE. AND  PREDICTION 

I  I A  T I  ON  FOR  INDIVIDUAL  OBSERVAT I ONS/ (3 < l 5.2EI 7.8 ) ) ) 

GO  TO  83 

3022  PRINT  86. (K.YY(K) .YSDEV(K) «K*| .KOUNT ) 

36  FORMAT (36H0 ITEM  NUMBER. PREDICTED  VALUE. ANO  PREDICTION  STANDARO 

I I  AT  I  ON  FOR  THE  PREDICTION  L!NF/<  3 ( IS » 2E 1 7.8 ) ) ) 

S3  IF (NR ) 1 07 « 1 03 . 84 

RESET  MATRIX  DIMENSIONS 
34  CALL  T I  ME (XV  IT) 


DEVMRCA2190 
MRCA2200 
MRC 422 10 
MRCA2220 
0EVMRCA2230 
MRCA2240 
MRCA2250 
MRCA2260 
MRCA2270 
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4009 

FORMAT <4HARUN« 1 5  •  5H  TOOK , FI  3 .8 • 9H  SECONDS*) 

MRCA2280 

PRINT  A 009* I RUN. XY IT 

MRCA2290 

CALL  SFTIT 

MRCA2300 

n=nsav 

MRCA2310 

NNsNNSAV 

MRCA2320 

NNN.NEN 

MRCA2330 

C  FORM  NEW  MATRIX  A  WITH  SMALLER  DIMENSIONS 

MRCA2340 

DO  701  1=1 *51 

MRCA2350 

701 

LOT ( I ) = 1 

MRCA2360 

READ  85 • (LOT (L ) «L. 1 »NNN ) 

MRCA2370 

85 

FORMAT (51 I 1 ) 

MRCA2380 

NNNSAV=NNN 

MRCA2390 

ERROR* 0.0 

MRCA2400 

CALL  reducm 

MRCA2410 

NN=N+2 

MRCA2420 

GO  TO  96 

MRCA2430 

107 

PRINT  507 

. 

MRCA2440 

S07 

FORMAT (29H0CAR0  TYPE  2  IS  INCORRECT. 

) 

NRCA2450 

GO  TO  5400 

MRCA2460 

1  03 

CALL  TIME(XYIT) 

MRCA2470 

PRINT  4009* IRUN.XYIT 

MRCA2480 

GO  TO( 126* 127* 127* 127) « 1 VORGO 

MRCA2490 

127 

NNNSAV*NEN 

NRCA2500 

GO  TO ( 1 26* 1 28  « 129* 1 26 ) « I VORGO 

MRCA2510 

128 

CALL  SET  it 

NRCA2520 

PRINT  2066 

NRCA2530 

2066 

FORMAT ( 1H2/65H  BEGIN  IVOR  R 

E 

GRESSION  CALC 

U  NRCA2540 

1L  A  T  I  0  N  S» 

NRCA2550 

SELECT. IVORS 

NRCA2560 

WR 1 TE  <  9  «  2066 ) 

MRCA2570 

2060 

FORMAT ( IH2» 1 I9X/7JM08  E  G  1  N  I  V  0 

R 

analysis  OF  V 

A  MRCA2560 

1RIANCE  TABLE  S.49X) 

MRCA2990 

CALL  IVOR 

NRCA2600 

CALL  T I  ME ( X I T  ) 

MRCA26I0 

PRINT  2093. XIT 

NRCA2620 

2093 

FORMAT (2IML IVOR  EXECUTION  TIME  .Fl I *9.9M  SECONDS*) 

NRCA2630 

GO  TO ( 1 26 « 126* 1 26. 129 ) *  1 VORGO 

MRCA2640 

129 

CALL  SET IT 

NRCA2650 

PRINT  2067 

MOCA 2660 

2067 

FORMAT (1H2/67H  BEGIN  BIVOR 

R 

EGRESSION  CAL 

C  MRCA2670 

IU  L  A  T  I  0  N  S) 

MOCA 2660 

SELECT.0 IVORS 

MRCA2690 

WRITE (9.2069) 

MRCA2700 

2069 

FORMAT (  1H2,  H9X/73H06  E  G  I  N  B  I  V 

0 

R  ANALYSIS  OF 

V  MRCA27I0 

1ARIANCE  TABLE  S.47X) 

MRCA2720 

CALL  BIVOR 

MRCA2730 

CALL  TIMEIXIT) 

MRCA2740 

PRINT  209*. XIT 

MRCA2790 

2096 

FORMAT (22HLB IVOR  EXECUTION  TIME  .Fit. 

B.9H  SCCONOS*) 

MRCA2760 

126   CALL INTVL(XIT)  MRCA2770
      END FILE 9  MRCA2780
      REWIND 9  MRCA2790
      END FILE 13  MRCA2800
      REWIND 13  MRCA2810
2062  READ(9,2061)LET  MRCA2820
2061  FORMAT(15A8)  MRCA2830
      IF(EOF(XYIT))GO TO 2060  MRCA2840
2063  PRINT 2061,LET  MRCA2850
      GO TO 2062  MRCA2860
2060  REWIND 9  MRCA2870
2094  READ(13,2061)LET  MRCA2880
      IF(EOF(XYIT))GO TO 2092  MRCA2890
2091  PRINT 2061,LET  MRCA2900
      GO TO 2094  MRCA2910
2092  REWIND 13  MRCA2920
      CALL INTVL(XIT)  MRCA2930
      PRINT 4008,XIT  MRCA2940
4008  FORMAT(24HLCOMPREHENSIVE PRINTOUTS,F13.5,9H SECONDS.)  MRCA2950
2010  CALL XLOK(XIT)  MRCA2960
      PRINT 2013,XIT  MRCA2970
2013  FORMAT(44H4TOTAL PROBLEM RUNNING TIME(HRS./MIN./SEC.)=,A8)  MRCA2980
      GO TO 5401  MRCA2990
221   STOP  MRCA3000
5400  RETURN  MRCA3010
      END  MRCA3020

SUBTYPE,FORTRAN,LMAP,PBIN  ABT00000
      SUBROUTINE ABT  ABT 0010
      COMMON A(51,51),BSDEV(51),B(2601),YY(7000),X(52),XD(51)  ABT 0020
      COMMON AVV(52),YSDEV(7000),AW(51),RECM,NDR,MVPL,NNNSAV,NNN,LOT(51)  ABT 0030
      COMMON NNL,DETERM,NOBS,TOLRS,TOLCES,ERROR,NPED,ITOTAL,N,NDPO,ICASE  ABT 0040
      COMMON RSSMO,ISKIP,NJ(25),M4,FIRM(7),KNUM,KMUM,MB,MI,NQ(25),IQ  ABT 0050
      COMMON NNXA,NNSAV,SDEV,AKP(51,51),BB(52),S(52,52),PGLB(10)  ABT 0060
      COMMON IN(49,10),IR,IS,M1,JLIM,NN,M,NTAPE  ABT 0070
      COMMON SELECT,IBID,IBIDS  ABT 0080
      DIMENSION ESTEP(30),IFREQ(31),FGRAPH(65),IOBF(30),CMPFR(30)  ABT 0090
      DIMENSION CHI(30)  ABT 0100
      DIMENSION EYY(7000)  ABT 0110
      DIMENSION LIT(52)  ABT 0120
      EQUIVALENCE (IDGO,NOBS)  ABT 0130
      EQUIVALENCE (LIT(1),B(1396))  ABT 0140
      EQUIVALENCE (ESTEP(1),B(201)),(IFREQ(1),B(231))  ABT 0150
      EQUIVALENCE (FGRAPH(1),B(262)),(IOBF(1),B(327))  ABT 0160
      EQUIVALENCE (CMPFR(1),B(357)),(CHI(1),B(387))  ABT 0170
      EQUIVALENCE (EYY,YSDEV)  ABT 0180


731 


967 

745 

998 

999 
.73 


17 


37 

16 


04  T  A 
DA^A 
DATA 
CALL 

call 

call 

PRINl 


BLANK (6M 
XXX(«mx 

zzzi  bm« 

I NTVL ( X  I T  > 
GAUSS 
I NTVL (XIT) 
987. I  CASE  < 


XIT 


FORMAT ( | 9HQMA TRIM  INVERSION  .IA.2IH  ...EVALUATION  TIME 

i  secoNos.i 

IF (ERROR  1 1 06. 998. 106 

IF (OtTCRM H 0*. 1 06*999 

PRINT  33. OE TERM 

FORM A T ( | 3nOOE TE RM | NANT • . G I B ) 

PRINT  A  INVERSE  AND  SOLUTION  TO  SIMULTANEOUS  EQUATIONS 
PRINT  | 7 

FORMAT (S9H0 INVERSE  OF  MATRIX  A  ANO  SOLUTION  TO 
IONS  I 

oo  r  i  *  i.appi 

PRINT  16. IA( | • 

FORMAT (TCI  7.8) 

SSR*0. 

00  20  I.I.MN 
SSR«SSR*6(  t  HMl  1 1 
SSf«S(»*6..»P6.|-SS® 

ATSS«Sl*P6..MNLfS(|. 

ASSP«ATSS-SSE 


1.8(11 


)4(S(|.NNL|/S(|.| It 


ABT 
ABT 
ABT 
ABT 
ABT 
ABT 
ABT 
AST 
AST 
ABT 
ABT 
ABT 
ABT 
ABT 
ABT 
ABT 

•.FI3.B.9HABT 

ABT 
ABT 
ABT 
ABT 
AST 
ABT 
ABT 

simultaneous  EOUATIABT 

ABT 

A0T 

ABT 

ABT 

ABT 

ABT 

ABT 

ABT 

ABT 

ABT 


0020 
0030 
0040 
0050 
0060 
0070 
0080 
0090 
0100 
0110 
0120 
0130 
0140 
OISO 
0160 
0170 
01  BO 
0190 
0200 
0210 
0220 
0230 
02*0 
0230 
0260 
0270 
0280 
0290 
0300 
0310 
0320 
0330 
0340 
0330 
0360 
0370 
03B0 
0390 
0400 
0410 
0420 
0430 
0440
      CORSQ=ASSR/ATSS  ABT 0450
      IF(CORSQ)109,22,23  ABT 0460
23    COR=SQRT(CORSQ)  ABT 0470
      FN=NNN  ABT 0480
      IF(S(1,1).EQ.FN)GO TO 31  ABT 0490
      SDEVSQ=SSE/(S(1,1)-FN)  ABT 0500
      IF(SDEVSQ)108,24,24  ABT 0510
31    SDEVSQ=0.0  ABT 0520
24    SDEV=SQRT(SDEVSQ)  ABT 0530
      DO 21 I=1,NNN  ABT 0540
      IF(A(I,I))996,997,997  ABT 0550
997   BSDEV(I)=SDEV*SQRT(A(I,I))  ABT 0560
21    CONTINUE  ABT 0570
      GO TO(150,151,150),IBIDS  ABT 0580
150   CALL IDENTM  ABT 0590
      CALL PRINTM  ABT 0600
      IF(IDGO.GT.1)GO TO 151  ABT 0610
      GO TO(151,151,152),IBIDS  ABT 0620
152   PRINT 153  ABT 0630
153   FORMAT(65H0NO IDENTITY MATRIX CHECKS WILL BE MADE ON SUBSEQUENT BI  ABT 0640
     1VOR RUNS.)  ABT 0650
151   PRINT 58,(I,BSDEV(I),I=1,NNN)  ABT 0660
58    FORMAT((35H0STANDARD DEVIATION OF COEFFICIENTS)/(6(I3,E17.8)))  ABT 0670
C     THE G FORMAT IS USED TO PRINT THE MAXIMUM NUMBER OF SIGNIFICANT DIGITS  ABT 0680
C     IN THE GIVEN NUMBER OF COLUMNS.  ABT 0690
1074  PRINT 574,SSE  ABT 0700
574   FORMAT(1H0,G18,35H RESIDUAL OR ERROR SUM OF SQUARES. )  ABT 0710
1075  PRINT 575,ATSS  ABT 0720
575   FORMAT(1H ,G18,45H TOTAL SUM OF SQUARES ADJUSTED FOR THE MEAN.  ABT 0730
     1 )  ABT 0740
1076  PRINT 576,ASSR  ABT 0750
576   FORMAT(1H ,G18,50H REGRESSION SUM OF SQUARES ADJUSTED FOR THE   ABT 0760
     1 MEAN.)  ABT 0770
1077  PRINT 577,COR  ABT 0780
577   FORMAT(1H ,G18,30H CORRELATION COEFFICIENT (R). )  ABT 0790
1078  PRINT 578,SDEV  ABT 0800
578   FORMAT(1H ,G18,35H SQUARE ROOT OF RESIDUAL VARIANCE. )  ABT 0810
      CALL CMPR(ASSR,SSE,N,M,COR,B,PGLB,LOT,NNNSAV,M4)  ABT 0820
      IF(NNNSAV)2083,2084,2083  ABT 0830
2084  RSSMO=ASSR  ABT 0840
      ESSO=SSE  ABT 0850
      NR1=M-NNN  ABT 0860
      NRO=N  ABT 0870
      WRITE(13,2093)PGLB  ABT 0880
2093  FORMAT(1H1,10A8,39X)  ABT 0890
      WRITE(13,2095)NR1  ABT 0900
2095  FORMAT(1H0,39HDEGREES OF FREEDOM OF ERROR VARIANCE = ,I4,76X)  ABT 0910
      WRITE(13,2088)  ABT 0920

20SS  FORMAT MMO.TSMCOCFFtCtCMT  OF  NO  (OF)  OF  F  FOR  RCSART 

I  SESSION  ON  |NDEFEN0ENT«4|X/|M  .  S4HDC TERR IMA T ION  DELETED  VAST 

2ARI ASlES  OELETEO  VARiASLES  VARIASLE  SELECTION, 3SX/ltOX)  AST 

      WRITE(13,2090)CORSQ  ABT 0960
2090  FORMAT(1H0,F9.7,11H - MAIN RUN,99X)  ABT 0970
      GO TO 59  ABT 0980
2083  CALL FIX  ABT 0990
      IF(RSSMO)54,2086,2086  ABT 1000
2086  NR2=NRO-N  ABT 1010
      FOUT=((RSSMO-ASSR)/(FLOAT(NR2)))/(ESSO/FLOAT(NR1))  ABT 1020
      IF(NR1.EQ.0)FOUT=9999999999.9999  ABT 1021
      WRITE(13,2089)CORSQ,NR2,FOUT,LIT  ABT 1030


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2089 

r ORM  *  T ( 1 M  .F9.7. 13X. I2.22X.F14,3.7X.9*A|  ) 

ABT 

1040 

GO  TO  39 

ABT 

1030 

54 

IE (RSSM071 .0)33.59.59 

ABT 

1060 

55 

*RITE< I 3.2093 )PGLB 

ABT 

1070 

WRITE  (13*36) 

ABT 

1080 

56 

FORMAT* I HO .80HNO  FINAL  COMPREHENSIVE  PRINTOUT  SINCE  MATRIX 

FOR  MAIA6T 

1090 

IN  RUN  COULD  MOT  BE  INVERTED  .39X1 

ABT 

1100 

RSSM0--1 .0 

ABT 

1110 

5  i 

I F  (  NPED  >99  *  3660  •  99 

ABT 

1  120 

.  compute  PREDICTION  error 

ABT 

1130 

59 

OEwINO  10 

ABT 

I  140 

ES2  •  0.0 

ABT 

I  ISO 

25 

DO  26  K>1 «M 

ABT 

I  160 

28 

READ  (  10  1  (X(  I  )•  I.2.4RISAV) 

ABT 

1170 

IF (NMWSAV >29.29. 7280 

ABT 

1  180 

7280 

NNw«2 

ABT 

1  190 

DO  7299  I  «2 .  I  TOTAL 

ABT 

1  2C0 

1’ (LOT ( l  )  1104.727*7299 

ABT 

1210 

727 

X ( NN» 1  *  x ( I ) 

ABT 

1220 

720 

NNVaNNW+1 

ABT 

1230 

'29  9 

CONTI NUF 

ABT 

1240 

2« 

YY  (to  30  (  1  1 

ABT 

1230 

DO  3C  I-2.NNN 

ABT 

1260 

?C 

YVOO  «YV  IK  |*Xl  1  )*BI  1  ) 

ABT 

1270 

EYY  (X  )3X(4P4SAV)-YY(K) 

ABT 

1280 

£S2«  ES2  EVV(Xi*EVYIK) 

ABT 

1290 

26 

continue 

ABT 

130C 

REWIND  10 

ABT 

1  3 1  0 

Z  COMPUTE  04 NOE  or  ERRORS 

ABT 

1320 

CALL  MAxM|N(N.EYY.EYVU»EVYL  . ixmax. IXM1N) 

ABT 

1330 

rnr  rr 

8MINE  ANO  plot  DISTRIBUTION  of  ERRORS.  PERFORM  CHI  SQUARE 

TEST 

ABT 

I  340 

C  IT  possible 

ABT 

1350 

22 

r RANGE  «  E^yu-Evyl 

ABT 

1  360 

COCl^A  «  E«ANGE>30*0 

ABT 

1370 

ESTEP m  •  EYYL  ♦  EOCt ta 

AST 

1380 

1FBE01 1 i  •  3.0 

ABT 

1390 

CO  ?3C 3  I  1*2*30 

ABT 

1400 

1 FREO ( 11).  0 

ABT 

l«10 

2  3  C  3 

ESTEP (  I  i  )  •  ESTEP(  M-l  14CDELTA 

ABT 

1420 

ESTEp  t  30 ) •  eYYU 

ABT 

1430 

IFRfOiJl )•  0 

ABT 

1440 

DO  2334  !!•!•* 

ABT 

1430 

UJ«  (EYYf II 1-EYYLl/fOELTA 

ABT 

1460 

200* 

)FRfo« uu*i )4iF»eot ju*i hi 

ABT 

1470 

I»RCQ(30).|FReot30)4|PREOf7l i 

ART 

!  4F“ 

C-ll  V. «ISO(N, tSTEP.IFReO.Sll  *1  l.SOEV.IOF. CHI. CHISUM.I08F.C* 

•PFR) 

ABT 

1490 

2082 

1F,NOPO-| 15026. 3027. 3026 

ABT 

1800 

*:■?* 

PRINT  *94 .POL B 

ABT 

1910 

594 

FORMAT (IMI.IOAtl 

ABT 

1920 

Hi  naPhSAv  >2014.2313*2014 

ABT 

1930 

251* 

PRINT  5760. SELECT, ILOT ILIOl .L IO«l .NNNSAV) 

ABT 

15*0 

576" 

FORMAT  I32H0 1 NOEPCNOENT  VARIABLE  SCLECTION  *A8*|X*S|I1) 

ABT 

1590 

23:i 

PRINT  5326. (X.VY(K).EYY(K)*K*| «M) 

ABT 

15*0 

5426 

FORMAT ( (A9H3ITEM  NUMBER  PREDICTED  VALUE  ANO  PREDICTION  ERROR  >✓ 

ABT 

1570 

1  <3< l*»2F|S*6) ) ) 

ABT 

15*0 

PRINT  SS28.ES2 

ABT 

1590 

PRINT  594. POLO 

ABT 

1*00 

IF (NNNSAV120I 7, 2010*2017 

ABT 

1610 

20 1 T 

PRINT  5760.  SELECT.  I  LOT  ILIOI  «HO.|  .MNNSAV1 

ABT 

1620 

2016 

PRINT  2009.ERAM0C 

ABT 

1630 
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2035  FORMAT {  40H0PRE0 !CT I ON  ERROR  FREQUENCY  DISTRIBUTION  / 


ABT 


640 


1  9H  RANGE  «  FlS*4«  / 

228H  UPPER  BOUND  FREQUENCY  2X* 

3  9HBAR  CHART* 61  X.3HCHI  .3X.6H08S  FR.3X.7HEXPD  FR  > 
2038  DO 2032 II=1,30  ABT 1680
      IFGRPH=IFREQ(II)  ABT 1690
      IFGRPH=MIN0(65,IFGRPH)  ABT 1700
      IF(IFGRPH)2024,2026,2024  ABT 1710
2024  DO 2025 IFG=1,IFGRPH  ABT 1720
2025  FGRAPH(IFG)=XXX  ABT 1730
2026  IFGRPH=IFGRPH+1  ABT 1740
      IF(IFGRPH-66)2033,2034,2034  ABT 1750
2034  FGRAPH(65)=ZZZ  ABT 1760
      GO TO 2035  ABT 1770
2033  DO 2027 IFG=IFGRPH,65  ABT 1780
2027  FGRAPH(IFG)=BLANK  ABT 1790
2035  IF(NDPO-1)2028,2030,2028  ABT 1800
2028  IF(CHI(II))2040,2041,2041  ABT 1810
2040  PRINT 2029,ESTEP(II),IFREQ(II),FGRAPH  ABT 1820
      GO TO 2032  ABT 1830




2041  PRINT 2029,ESTEP(II),IFREQ(II),FGRAPH,CHI(II),IOBF(II),CMPFR(II)  ABT 1840
2029  FORMAT(2X,F15.4,2X,I5,6X,1HI,65A1,1X,F8.3,1X,I5,2X,F9.3)  ABT 1850
      GO TO 2032  ABT 1860
2030  IF(CHI(II))2042,2043,2043  ABT 1870
2042  PRINT 2031,ESTEP(II),IFREQ(II),FGRAPH  ABT 1880
      GO TO 2032  ABT 1890
2043  PRINT 2031,ESTEP(II),IFREQ(II),FGRAPH,CHI(II),IOBF(II),CMPFR(II)  ABT 1900
2031  FORMAT(2X,E15.8,2X,I5,6X,1HI,65A1,1X,F8.3,1X,I5,2X,F9.3)  ABT 1910
2032  CONTINUE  ABT 1920
      IF(IDF)2048,2048,2049  ABT 1930
2048  PRINT 2050  ABT 1940
2050  FORMAT(1X,31HCHISQUARE COULD NOT BE COMPUTED)  ABT 1950
      GO TO 5660  ABT 1960
2049  PRINT 2039,CHISUM,IDF  ABT 1970
2039  FORMAT(12H CHISQUARE =F15.3,22H DEGREES OF FREEDOM = I5)  ABT 1980
      GO TO 5660  ABT 1990
5027  PRINT 594,PGLB  ABT 2000
      IF(NNNSAV)2019,2018,2019  ABT 2010
2019  PRINT 5760,SELECT,(LOT(LIQ),LIQ=1,NNNSAV)  ABT 2020
2018  PRINT 75,(K,YY(K),EYY(K),K=1,M)  ABT 2030
75    FORMAT((49H0ITEM NUMBER PREDICTED VALUE AND PREDICTION ERROR)/  ABT 2040
     1(3(I5,2E15.6)))  ABT 2050
      PRINT 5528,ES2  ABT 2060
5528  FORMAT(27H0CHECK ERROR SUM OF SQUARES/1H ,G18)  ABT 2070
      PRINT 594,PGLB  ABT 2080
      IF(NNNSAV)2021,2020,2021  ABT 2090
2021  PRINT 5760,SELECT,(LOT(LIQ),LIQ=1,NNNSAV)  ABT 2100
2020  PRINT 2006,ERANGE  ABT 2110

2006  FORMAT (  40H0PREDICT I  ON  ERROR  FREQUENCY  DISTRIBUTION  / 

1  9H  RANGE  ■  E15*6«  / 

228H  UPPER  BOUND  FREQUENCY  2X» 

3  9H8AR  CHART «  61 X .3NCH I  *  3X .6H08S  FR.3X.7HEXPD  FR > 

      GO TO 2038  ABT 2160
104   PRINT 504  ABT 2170
504   FORMAT(36H0A RERUN CARD IS MADE UP INCORRECTLY)  ABT 2180
      ERROR=1.0  ABT 2190
      GO TO 5661  ABT 2200
106   PRINT 505  ABT 2210
505   FORMAT(25H0MATRIX FAILED TO INVERT.)  ABT 2220
      GO TO 83  ABT 2230
108   PRINT 506  ABT 2240
506   FORMAT(22H0VARIANCE IS NEGATIVE.)  ABT 2250
      GO TO 83  ABT 2260
109   PRINT 503  ABT 2270
503   FORMAT(55H0THE SQUARE OF THE CORRELATION COEFFICIENT IS NEGATIVE.)  ABT 2280
      GO TO 83  ABT 2290
996   PRINT 995  ABT 2300
995   FORMAT(67H0AN ELEMENT OF THE MAIN DIAGONAL OF THE INVERSE MATRIX I  ABT 2310
     1S NEGATIVE.)  ABT 2320
83    ERROR=1.0  ABT 2330
      GO TO 5661  ABT 2340
5660  ERROR=0.0  ABT 2350
5661  RETURN  ABT 2360
      END  ABT 2370

T     SUBTYPE,FORTRAN,LMAP,PBIN  BIVOR000
      SUBROUTINE BIVOR  BIVOR010
C     BIVOR-BACKWARD IVOR-INDEPENDENT VARIABLE SELECTION SUBROUTINE FOR THE  BIVOR020
C     ORDERING OF INDEPENDENT VARIABLES ACCORDING TO MAGNITUDES OF  BIVOR030
C     REGRESSION SUMS OF SQUARES.  BIVOR040
      COMMON A(51,51),BSDEV(51),B(2601),YY(7000),X(52),XD(51)  BIVOR050
      COMMON AVV(52),YSDEV(7000),AW(51),RECM,NDR,MVPL,NNNSAV,NNN,LOT(51)  BIVOR060
      COMMON NNL,DETERM,NIBS,TOLRS,TOLCES,ERROR,NPED,ITOTAL,N,NDPO,ICASE  BIVOR070
      COMMON RSSMO,ISKIP,NJ(25),M4,FIRM(7),KNUM,KMUM,MB,MI,NQ(25),IQ  BIVOR080
      COMMON NNXA,NNSAV,SDEV,AKP(51,51),BB(52),S(52,52),PGLB(10)  BIVOR090
      COMMON IN(49,10),IR,IS,M1,JLIM,NN,M,NTAPE  BIVOR100
      COMMON SELECT,IBID,IBIDS  BIVOR110
      DIMENSION LAT(51)  BIVOR120
      EQUIVALENCE (LAT,XD)  BIVOR130
      EQUIVALENCE (NIBS,IDGO)  BIVOR140
C     M4 COUPLED WITH THE VARIABLE LL(IN CMPR) CONTROLS THE PRINTING OF  BIVOR150
C     ANALYSIS OF VARIANCE TABLES.  BIVOR160
      NOBS=1  BIVOR170
C     SAVE NPED  BIVOR180
      NTAPE=NPED  BIVOR190
      KNUM=-1  BIVOR200
C     KNUM=-1 LETS CASSR KNOW THAT BIVOR(INSTEAD OF IVOR)IS BEING USED.  BIVOR210
      NPED=1  BIVOR220
      ASSIGN 551 TO ISEE  BIVOR230
      M4=0  BIVOR240
      GO TO(1,22),ISKIP  BIVOR250
1     WRITE(13,103)  BIVOR260
103   FORMAT(56H0*** BIVOR FINAL COMPREHENSIVE                       ***  BIVOR270
     1,64X/120X)  BIVOR280
22    DO 101 I=1,51  BIVOR290
101   LOT(I)=0  BIVOR300
      ITOTAL=1  BIVOR310
      DO 102 I=1,MB  BIVOR320
102   ITOTAL=ITOTAL+NQ(I)  BIVOR330
      ISTART=ITOTAL+1  BIVOR340
      IF(ISTART-51)106,107,107  BIVOR350
106   DO 105 I=ISTART,51  BIVOR360
105   LOT(I)=1  BIVOR370
107   DO 200 I=1,MB  BIVOR380
      IDUM=0  BIVOR390
      ITOTAL=ISTART-1  BIVOR400
      ISTART=ISTART-NQ(I)  BIVOR410
      NQQ=NQ(I)  BIVOR420
      JSAVE=ITOTAL  BIVOR430
      JLOT=ISTART-1  BIVOR440
      DO 600 K=1,NQQ  BIVOR450
      GO TO ISEE  BIVOR460
551   ERROR=1.0  BIVOR470
      CALL REDUCM  BIVOR480
      CALL CASSR(IDUM,KGO)  BIVOR490
      GO TO(500,501),KGO  BIVOR500
501   LOT(JSAVE)=1  BIVOR510
      JSAVE=JSAVE-1  BIVOR520
      GO TO 600  BIVOR530
500   JLOT=ISTART-1  BIVOR540
      KASSR=0  BIVOR550
      DO 300 J=ISTART,ITOTAL  BIVOR560
      IF(LOT(J))301,301,300  BIVOR570
301   JLOT=JLOT+1  BIVOR580
      KASSR=KASSR+1  BIVOR590
      AW(KASSR)=B(JLOT)*(B(JLOT)/A(JLOT,JLOT))  BIVOR600
      LAT(KASSR)=J  BIVOR610
300   CONTINUE  BIVOR620
204   IF(KASSR-1)221,400,404  BIVOR630
400   IXMIN=1  BIVOR640
      GO TO 402  BIVOR650
404   CALL MAXMIN(KASSR,AW,AMAX,AMIN,IXMAX,IXMIN)  BIVOR660
402   IMAX=LAT(IXMIN)  BIVOR670
      IF(IBID.NE.2.OR.IDGO.NE.1)GO TO 100  BIVOR680
      IBIDS=3  BIVOR690
      IBID=1  BIVOR700
100   GO TO(524,526),NOBS  BIVOR710
524   IF(NNN-NNNSAV)525,526,525  BIVOR720
525   ERROR=0.0  BIVOR730
      CALL REDUCM  BIVOR740
      CALL ABT  BIVOR750
      IF(IBIDS.EQ.3)IBIDS=2  BIVOR760
      NPED=NTAPE  BIVOR770
      IF(NNN-2)220,220,526  BIVOR780
526   LOT(IMAX)=1  BIVOR790
      NOBS=2  BIVOR800
      NPED=NTAPE  BIVOR810
      ERROR=0.0  BIVOR820
      CALL REDUCM  BIVOR830
      CALL ABT  BIVOR840
      IF(IBIDS.EQ.3)IBIDS=2  BIVOR850
202   IF(NNN-2)220,220,599  BIVOR860
599   ASSIGN 500 TO ISEE  BIVOR870
600   CONTINUE  BIVOR880
200   CONTINUE  BIVOR890
220   RETURN  BIVOR900
221   STOP  BIVOR910
      END  BIVOR920


SUBTYPE,FORTRAN,LMAP,PBIN  CASSR000
      SUBROUTINE CASSR(KASSR,KGO)  CASSR010
      COMMON A(51,51),BSDEV(51),B(2601),YY(7000),X(52),XD(51)  CASSR020
      COMMON AVV(52),YSDEV(7000),AW(51),RECM,NDR,MVPL,NNNSAV,NNN,LOT(51)  CASSR030
      COMMON NNL,DETERM,NOBS,TOLRS,TOLCES,ERROR,NPED,ITOTAL,N,NDPO,ICASE  CASSR040
      COMMON RSSMO,ISKIP,NJ(25),M4,FIRM(7),KNUM,KMUM,MB,MI,NQ(25),IQ  CASSR050
      COMMON NNXA,NNSAV,SDEV,AKP(51,51),BB(52),S(52,52),PGLB(10)  CASSR060
      COMMON IN(49,10),IR,IS,M1,JLIM,NN,M,NTAPE  CASSR070
      COMMON SELECT,IBID,IBIDS  CASSR080
      DIMENSION EYY(7000)  CASSR090
      EQUIVALENCE (EYY,YSDEV)  CASSR100
      EQUIVALENCE (IDGO,NOBS)  CASSR110
      EQUIVALENCE (TOLI2,TOLCES)  CASSR120
      FNNN=NNN  CASSR130
      CALL GAUSS  CASSR140
745   IF(ERROR)106,998,106  CASSR150
998   IF(DETERM)106,106,57  CASSR160
57    SSR=0.0  CASSR170
      DO 20 I=1,NNN  CASSR180
20    SSR=SSR+B(I)*BB(I)  CASSR190
      SSE=S(NNL,NNL)-SSR  CASSR200
      ATSS=S(NNL,NNL)-((S(1,NNL)**2)/S(1,1))  CASSR210
      ASSR=ATSS-SSE  CASSR220
C     ISKIP IS SET IN THE MAIN PROGRAM. ISKIP=1 IF MAIN RUN IS SUCCESSFUL,=2  CASSR230
C     MAIN RUN IS UNSUCCESSFUL.  CASSR240
      GO TO(579,580),ISKIP  CASSR250
580   CORSQ=ASSR/ATSS  CASSR260
      IF(CORSQ)109,23,23  CASSR270
23    SDEVSQ=SSE/(S(1,1)-FNNN)  CASSR280
      IF(SDEVSQ)108,24,24  CASSR290
24    DO 21 I=1,NNN  CASSR300
      IF(A(I,I))996,21,21  CASSR310
21    CONTINUE  CASSR320
999   CALL IDENTM  CASSR330
      GO TO(579,579,579,17),IDGO  CASSR340
579   KASSR=KASSR+1  CASSR350
      AW(KASSR)=ASSR  CASSR360
C     ASSR = REGRESSION SUM OF SQUARES ADJUSTED FOR THE MEAN  CASSR370
C     KGO=1 MEANS VALID ASSR WAS COMPUTED,=2 INVALID ASSR  CASSR380
      KGO=1  CASSR390
      GO TO 221  CASSR400
17    IF(KNUM+1)19,18,19  CASSR410
18    IF(NNN.EQ.NNNSAV)GO TO 19  CASSR420
      CALL REDUCM  CASSR430
      CALL ABT  CASSR440
      KGO=2  CASSR450
      GO TO 221  CASSR460
19    PRINT 2009,TOLI2  CASSR470
2009  FORMAT(79H0DEVIATION OF A MAIN DIAGONAL ELEMENT IN THE IDENTITY MA  CASSR480
     1TRIX LARGER THAN I(2)= ,G9,15H ,RUN REJECTED.)  CASSR490
      GO TO 83  CASSR500
104   PRINT 110  CASSR510
110   FORMAT(32H IVS CONTAINED NEGATIVE ELEMENT.)  CASSR520
      GO TO 83  CASSR530
996   PRINT 995  CASSR540
995   FORMAT(67H0AN ELEMENT OF THE MAIN DIAGONAL OF THE INVERSE MATRIX I  CASSR550
     1S NEGATIVE.)  CASSR560
      GO TO 83  CASSR570
106   PRINT 505  CASSR580
505   FORMAT(25H0MATRIX FAILED TO INVERT.)  CASSR590
      GO TO 83  CASSR600
108   PRINT 506  CASSR610
506   FORMAT(22H0VARIANCE IS NEGATIVE.)  CASSR620
      GO TO 83  CASSR630
109   PRINT 503  CASSR640
503   FORMAT(55H0THE SQUARE OF THE CORRELATION COEFFICIENT IS NEGATIVE.)  CASSR650
83    PRINT 2089,(LOT(I),I=1,NNNSAV)  CASSR660
2089  FORMAT(6H IVS= ,51I1)  CASSR670
      KGO=2  CASSR680
221   RETURN  CASSR690
      END  CASSR700
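
The sums of squares that CASSR and ABT derive from the solved normal equations can be illustrated outside FORTRAN. The Python sketch below is not part of DA-MRCA. The function name `regression_sums` and its argument names are assumptions chosen to mirror the listing: `b` stands for B, `rhs` for BB, `s_yy` for S(NNL,NNL), `s_1y` for S(1,NNL), and `n_obs` for S(1,1). The zero-variance guard follows the branch to statement 31 in ABT.

```python
def regression_sums(b, rhs, s_yy, s_1y, n_obs):
    """Hypothetical mirror of the SSR/SSE/ATSS/ASSR arithmetic in the listing."""
    ssr = sum(bi * ri for bi, ri in zip(b, rhs))  # SSR = sum of B(I)*BB(I)
    sse = s_yy - ssr                              # residual (error) sum of squares
    atss = s_yy - s_1y ** 2 / n_obs               # total SS adjusted for the mean
    assr = atss - sse                             # regression SS adjusted for the mean
    corsq = assr / atss                           # squared correlation coefficient
    dof = n_obs - len(b)                          # observations minus coefficients
    sdevsq = sse / dof if dof != 0 else 0.0       # residual variance (0 if no d.o.f.)
    return {"ssr": ssr, "sse": sse, "atss": atss,
            "assr": assr, "corsq": corsq, "sdevsq": sdevsq}
```

For example, a straight-line fit to y = (1, 2, 2, 4) at x = (0, 1, 2, 3) has coefficients (0.9, 0.9) and right-hand sides (9, 18), which can be passed as `regression_sums([0.9, 0.9], [9.0, 18.0], 25.0, 9.0, 4)`.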

T     SUBTYPE,FORTRAN,LMAP,PBIN  CHISQ000
      SUBROUTINE CHISQ(N,ESTEP,IFREQ,OB,SDEV,IDF,CHI,CHISUM,IOBFR,  CHISQ010
     1CMPFR)  CHISQ020
C     THIS SUBROUTINE FITS A NORMAL CURVE WITH MEAN 0 AND STANDARD DEVIATION  CHISQ030
C     SDEV TO THE DATA IN IFREQ WHERE THE UPPER BOUND OF EACH INTERVAL IS IN  CHISQ040
C     THE CORRESPONDING ENTRY IN ESTEP. N IS THE NUMBER OF INDEPENDENT  CHISQ050
C     VARIABLES. OB IS THE NUMBER OF OBSERVATIONS.  CHISQ060
C     THE ROUTINE GROUPS THE DATA SO THAT THERE ARE AT LEAST 5 COMPUTED  CHISQ070
C     VALUES IN EACH INTERVAL AND THEN COMPUTES THE CHI SQUARE STATISTIC TO  CHISQ080
C     GIVE AN ESTIMATION OF THE GOODNESS OF FIT. ON EXIT FROM THE ROUTINE  CHISQ090
C     IDF CONTAINS THE NUMBER OF DEGREES OF FREEDOM, CHISUM THE CHISQUARE  CHISQ100
C     VALUE, AND CHI(J) CONTAINS A -1 IF THE JTH INTERVAL WAS NOT THE LAST  CHISQ110
C     OF A GROUP OTHERWISE IT CONTAINS (OBSERVED FREQUENCY-THEORETICAL  CHISQ120
C     FREQUENCY)**2/THEORETICAL FREQUENCY. IF THERE IS AN INSUFFICIENT  CHISQ130
C     NUMBER OF OBSERVATIONS THE FIT IS NOT ATTEMPTED AND ALL OUTPUT VALUES  CHISQ140
C     ARE SET TO -1 EXCEPT FOR IDF WHICH WILL BE -(N+3).  CHISQ150
C     IOBF(J) CONTAINS ON EXIT THE OBSERVED FREQUENCY IF THE JTH INTERVAL  CHISQ160
C     WAS THE LAST OF A GROUP, OTHERWISE ITS CONTENTS ARE MEANINGLESS.  CHISQ170
C     LIKEWISE CMPFR(J) CONTAINS THE THEORETICAL FREQUENCY.  CHISQ180
      DIMENSION ESTEP(30),IFREQ(30),CHI(30)  CHISQ190
      DIMENSION IOBFR(30),CMPFR(30)  CHISQ200
      FOT=0.0  CHISQ210
      KOUNT=0  CHISQ220
      CHISUM=0.0  CHISQ230
      PROBO=0.0  CHISQ240
      FO=0.0  CHISQ250
      IF(OB/5.0-FLOAT(N)-3.0)1,1,3  CHISQ260
1     JJ=1  CHISQ270
      CHISUM=-1.0  CHISQ280
      CHI(30)=-1.0  CHISQ290
      GO TO 14  CHISQ300
3     DO 10 J=1,30  CHISQ310
      IF(J.NE.30)GO TO 11  CHISQ320
      PROBN=1.0  CHISQ330
      GO TO 2  CHISQ340
11    PROBN=FREQ(ESTEP(J)/SDEV)  CHISQ350
2     FOC=OB*(PROBN-PROBO)  CHISQ360
      FO=FLOAT(IFREQ(J))+FO  CHISQ370
      IF(FOC-5.0)4,4,5  CHISQ380
4     CHI(J)=-1.0  CHISQ390
      GO TO 10  CHISQ400
5     FOT=FOT+FO  CHISQ410
      REMAIN=OB*(1.0-PROBO)  CHISQ420
      IF(OB*(1.0-PROBN)-5.0)6,6,7  CHISQ430
6     FO=FO+(OB-FOT)  CHISQ440
      FOMRE=FO-REMAIN  CHISQ450
      CHI(30)=FOMRE*FOMRE/REMAIN  CHISQ460
      IOBFR(30)=FO  CHISQ470
      CMPFR(30)=REMAIN  CHISQ480
      CHISUM=CHISUM+CHI(30)  CHISQ490
      KOUNT=KOUNT+1  CHISQ500
      JJ=J  CHISQ510
      GO TO 12  CHISQ520
7     IF(J-30)9,8,9  CHISQ530
8     FOC=REMAIN  CHISQ540
9     CHI(J)=(FO-FOC)**2/FOC  CHISQ550
      CHISUM=CHISUM+CHI(J)  CHISQ560
      IOBFR(J)=FO  CHISQ570
      CMPFR(J)=FOC  CHISQ580
      KOUNT=KOUNT+1  CHISQ590
      FO=0.0  CHISQ600
      PROBO=PROBN  CHISQ610
10    CONTINUE  CHISQ620
      GO TO 17  CHISQ630
12    IF(JJ-29)14,14,17  CHISQ640
14    DO 16 J=JJ,29  CHISQ650
16    CHI(J)=-1.0  CHISQ660
17    IDF=KOUNT-N-3  CHISQ670
      RETURN  CHISQ680
      END  CHISQ690
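
The grouping rule described in the CHISQ comments can be illustrated with a short Python sketch. It is not part of DA-MRCA and makes several labelled assumptions: `normal_cdf` stands in for the listing's FREQ function (taken here to be the standard normal cumulative distribution), the observation count is taken as the sum of the observed frequencies rather than passed separately as OB, and the function and argument names are hypothetical. The per-interval output arrays CHI, IOBFR and CMPFR are omitted.

```python
import math


def normal_cdf(z):
    """Standard normal CDF; assumed stand-in for the listing's FREQ."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def grouped_chi_square(upper_bounds, observed, sdev, n_indep, min_expected=5.0):
    """Return (chi-square sum, degrees of freedom) after merging thin intervals."""
    total = float(sum(observed))
    # Too few observations: the fit is not attempted.
    if total / 5.0 - n_indep - 3.0 <= 0.0:
        return -1.0, -(n_indep + 3)
    chi_sum, groups = 0.0, 0
    prev_p = 0.0      # CDF at the upper edge of the last closed group
    open_obs = 0.0    # observed count in the group still being built
    closed_obs = 0.0  # observed count in groups already closed
    last = len(upper_bounds) - 1
    for j, (bound, count) in enumerate(zip(upper_bounds, observed)):
        p = 1.0 if j == last else normal_cdf(bound / sdev)
        expected = total * (p - prev_p)
        open_obs += count
        if expected <= min_expected:
            continue  # keep merging intervals into the open group
        remaining = total * (1.0 - prev_p)
        if total * (1.0 - p) <= min_expected:
            # Upper tail too thin: fold everything left into this group.
            open_obs = total - closed_obs
            chi_sum += (open_obs - remaining) ** 2 / remaining
            groups += 1
            break
        chi_sum += (open_obs - expected) ** 2 / expected
        groups += 1
        closed_obs += open_obs
        open_obs = 0.0
        prev_p = p
    return chi_sum, groups - n_indep - 3
```

As in the listing, the degrees of freedom are the number of groups minus the number of independent variables minus 3, so a non-positive value signals that no meaningful test was possible.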

SUBTYPE,FORTRAN,LMAP,PBIN  CMPR0000
      SUBROUTINE CMPR(RSSM,ESS,IR,N,CORR,B,PGLB,LOT,NNNSAV,M4)  CMPR0010
      DIMENSION B(51),J(52),PGLB(10),LOT(51)  CMPR0020
      COMMON DUM(25458),SELECT  CMPR0030
      DIMENSION LIT(52)  CMPR0040
      DIMENSION FORM(9),BCDA(4),BCDB(4),BCDC(4),FORM2(8)  CMPR0050
      EQUIVALENCE (DUM(4048),LIT(1)),(DUM(4100),J(1))  CMPR0060

DATA (FORM (I  ) «  I  a  1 ,9  )  (0H (5H0Y  ■  .9HE20.14.  .8H 

,eH(3H  +  ,  .8HCMPR0070 

1E20. 14.3.0HH  X( • I2..8H1H) ),  . 8H  »8HX> 

)  CMPROOBO 

DATA (3CDA< I  ) ,  I  *  1  . 4  )  ( 1 H 1 , 1 H2 , 1 H3 , 1 H4 ) 

CMPR0090 

DATA  OCOB ( I  )  ♦  I  *1 .4  )  ( 1H0.2H66.2H37.2H  81 

CMPROIOO 

DATA (BCDC ( I ) «  1  =  1 ,4  )  (2H88.2H58.2H29, 1  HI  > 

CMPROl 1 0 

DATA (F0RM2 ( I ) , 1*1 .  8><BH{4X«  ,8H  .8HI2H+  , 

E2. 8H0. 14. 3H  »  CMPRO 12  0 

18HX( ,  12 . 1H.8H) « IX)  •  «8H  «8HX)  ) 

CMPROl 30 

DATA  BLANK <6H  ) 

CMPROl 40 

DATA  L ( 9 1 

CMPRO 150 

      IF(M4)88,87,88  CMPR0160
87    LL=1  CMPR0170
      M4=1  CMPR0180
88    R=IR  CMPR0190
      AN=N  CMPR0200
      RSQUOT=RSSM/R  CMPR0210
      OMR=AN-R-1.  CMPR0220
      NOMR=OMR  CMPR0230
      IF(OMR.EQ.0.0)GO TO 200  CMPR0231
      ESQUOT=ESS/OMR  CMPR0240
      FQUOT=RSQUOT/ESQUOT  CMPR0250
      GO TO 201  CMPR0251
200   ESQUOT=0.0  CMPR0252
      FQUOT=9999999999.9999  CMPR0253
201   IRCT=IR+1  CMPR0254
62    IF(NNNSAV)38,63,64  CMPR0270
63    LL=2  CMPR0280
      K=IRCT+1  CMPR0290
      DO 56 I=2,K  CMPR0300
56    J(I)=I-1  CMPR0310
70    WRITE(L,1)PGLB,BLANK  CMPR0320
21    WRITE(L,3)  CMPR0330
85    WRITE(L,86)  CMPR0340
22    WRITE(L,4)IR,RSSM,RSQUOT,FQUOT,BLANK  CMPR0350
      DO 23 I=1,2  CMPR0360
23    WRITE(L,5)  CMPR0370
      WRITE(L,6)NOMR,ESS,ESQUOT,BLANK  CMPR0380
      WRITE(L,7)CORR,BLANK  CMPR0390
84    DO 27 I=1,IRCT,4  CMPR0400
      LAST=I+3  CMPR0410
      IF(LAST-IRCT)51,51,52  CMPR0420
52    LAST=IRCT  CMPR0430
51    IW=LAST-I+1  CMPR0440
      IF(I-1)53,53,54  CMPR0450
53    FORM(3)=BCDA(IW-1)  CMPR0460
      FORM(8)=BCDB(IW)  CMPR0470
      WRITE(L,FORM)B(1),(B(K),J(K),K=2,LAST)  CMPR0480
      GO TO 27  CMPR0490
54    FORM2(2)=BCDA(IW)  CMPR0500
      FORM2(7)=BCDC(IW)  CMPR0510
      WRITE(L,FORM2)(B(K),J(K),K=I,LAST)  CMPR0520
27    CONTINUE  CMPR0530
      RETURN  CMPR0540
64    I=0  CMPR0550
      CALL FIX  CMPR0560
      DO 101 JK=1,NNNSAV  CMPR0570
      IF(LOT(JK))101,100,101  CMPR0580
100   I=I+1  CMPR0590
      J(I)=JK-1  CMPR0600
101   CONTINUE  CMPR0610
      GO TO(65,66),LL  CMPR0620
65    WRITE(L,1)PGLB,BLANK  CMPR0630
      GO TO 80  CMPR0640
66    DO 67 I=1,6  CMPR0650
67    WRITE(L,2)  CMPR0660
80    WRITE(L,8)SELECT,LIT  CMPR0670
      WRITE(L,83)  CMPR0680
      LL=3-LL  CMPR0690
      GO TO 85  CMPR0700
1     FORMAT(1H1,10A8,39X/119X,A1)  CMPR0710
2     FORMAT(1H ,119X)  CMPR0720
3     FORMAT(9H0MAIN RUN,111X)  CMPR0730
4     FORMAT(11H REGRESSION,20X,I5,1X,F20.09,1X,F20.09,1X,F20.09,20X,A1)  CMPR0740
5     FORMAT(11H0REGRESSION,109X)  CMPR0750
6     FORMAT(6H0ERROR,25X,I5,1X,F20.09,1X,F20.09,41X,A1)  CMPR0760
7     FORMAT(12H0CORRELATION,4X,F10.9,93X,A1)  CMPR0770
8     FORMAT(32H0INDEPENDENT VARIABLE SELECTION ,A8,1X,52A1,27X)  CMPR0780
83    FORMAT(100X,5H     ,15X)  CMPR0790
86    FORMAT(1H0,33X,2HDF,13X,2HSS,19X,2HMS,19X,1HF,30X)  CMPR0800
      END  CMPR0810

SUBTYPE,FORTRAN,LMAP,PBIN  FIX  000
      SUBROUTINE FIX  FIX  010
      COMMON A(51,51),BSDEV(51),B(2601),YY(7000),X(52),XD(51)  FIX  020
      COMMON AVV(52),YSDEV(7000),AW(51),RECM,NDR,MVPL,NNNSAV,NNN,LOT(51)  FIX  030
      COMMON NNL,DETERM,NOBS,TOLRS,TOLCES,ERROR,NPED,ITOTAL,N,NDPO,ICASE  FIX  040
      COMMON RSSMO,ISKIP,NJ(25),M4,FIRM(7),KNUM,KMUM,MB,MI,NQ(25),IQ  FIX  050
      COMMON NNXA,NNSAV,SDEV,AKP(51,51),BB(52),S(52,52),PGLB(10)  FIX  060
      DIMENSION LIT(52)  FIX  070
      EQUIVALENCE (LIT(1),B(1396))  FIX  080
      DATA KZERO(1H0),KONE(1H1),KBLANK(1H )  FIX  090
      DO 3 I=1,NNNSAV  FIX  100
      IF(LOT(I))2,2,1  FIX  110
1     LIT(I)=KONE  FIX  120
      GO TO 3  FIX  130
2     LIT(I)=KZERO  FIX  140
3     CONTINUE  FIX  150
      DO 4 I=NNSAV,52  FIX  160
4     LIT(I)=KBLANK  FIX  170
      RETURN  FIX  180
      END  FIX  190

SUBTYPE,FORTRAN,LMAP,PBIN  GAUSS000
      SUBROUTINE GAUSS  GAUSS010
      COMMON A(51,51),IPIVOT(51),B(51,51),YY(7000),X(52),XD(51)  GAUSS020
      COMMON AVV(52),YSDEV(7000),AW(51),RECM,NDR,MVPL,NNNSAV,NNN,LOT(51)  GAUSS030
      COMMON NNL,DETERM,NOBS,TOLRS,TOLCES,ERROR,NPED,ITOTAL,V,NDPO,ICASE  GAUSS040
      COMMON RSSMO,ISKIP,NJ(25),M4,FIRM(7),KNUM,KMUM,MB,MI,NQ(25),IQ  GAUSS050
      COMMON NNXA,NNSAV,SDEV,AKP(51,51),BB(52),S(52,52),PGLB(10)
      COMMON IN(49,10),IR,IS,M1,JLIM,NN,Z,NTAPE
      COMMON SELECT,IBID,IBIDS
      DIMENSION INDEX(51,2)
      EQUIVALENCE (N,NNN)
      EQUIVALENCE (YY(1),INDEX(1))
      EQUIVALENCE (IROW,JROW), (ICOLUM,JCOLUM), (AMAX, T, SWAP)
C     INITIALIZATION
      ERROR=0.0
      M=1
10    DETERM=1.0
15    DO 20 J=1,N
20    IPIVOT(J)=0
30    DO 550 I=1,N
      IGO=1
C     SEARCH FOR PIVOT ELEMENT
40    AMAX=0.0
45    DO 105 J=1,N
50    IF (IPIVOT(J)-1) 60, 105, 60
60    DO 100 K=1,N
      IF (IPIVOT(K)-1) 80,100,899
80    IF (ABS(AMAX)-ABS(A(J,K))) 85, 100, 100
85    IROW=J
90    ICOLUM=K
95    AMAX=A(J,K)
      IGO=2
100   CONTINUE
105   CONTINUE
      GO TO(106,110),IGO
106   DETERM=0.0
      GO TO 740
110   IPIVOT(ICOLUM)=IPIVOT(ICOLUM)+1
      IF (A(ICOLUM,ICOLUM))130,899,130
C     INTERCHANGE ROWS TO PUT PIVOT ELEMENT ON DIAGONAL
C
130   IF (IROW-ICOLUM) 140, 260, 140
140   DETERM=-DETERM
150   DO 200 L=1,N
160   SWAP=A(IROW,L)
170   A(IROW,L)=A(ICOLUM,L)
200   A(ICOLUM,L)=SWAP
210   DO 250 L=1,M
220   SWAP=B(IROW,L)
230   B(IROW,L)=B(ICOLUM,L)
250   B(ICOLUM,L)=SWAP
260   INDEX(I,1)=IROW
270   INDEX(I,2)=ICOLUM
310   PIVOT=A(ICOLUM,ICOLUM)
320   DETERM=DETERM*PIVOT
C
C     DIVIDE PIVOT ROW BY PIVOT ELEMENT
C
330   A(ICOLUM,ICOLUM)=1.0


340   DO 350 L=1,N  GAUSS650
350   A(ICOLUM,L)=A(ICOLUM,L)/PIVOT  GAUSS660
360   DO 370 L=1,M  GAUSS670
370   B(ICOLUM,L)=B(ICOLUM,L)/PIVOT  GAUSS680
C  GAUSS690
C     REDUCE NON-PIVOT ROWS  GAUSS700
C  GAUSS710
380   DO 550 L1=1,N  GAUSS720
390   IF(L1-ICOLUM) 400, 550, 400  GAUSS730
400   T=A(L1,ICOLUM)  GAUSS740
420   A(L1,ICOLUM)=0.0  GAUSS750
430   DO 450 L=1,N  GAUSS760
450   A(L1,L)=A(L1,L)-A(ICOLUM,L)*T  GAUSS770
460   DO 500 L=1,M  GAUSS780
500   B(L1,L)=B(L1,L)-B(ICOLUM,L)*T  GAUSS790
550   CONTINUE  GAUSS800
C  GAUSS810
C     INTERCHANGE COLUMNS  GAUSS820
C  GAUSS830
600   DO 710 I=1,N  GAUSS840
610   L=N+1-I  GAUSS850
620   IF (INDEX(L,1)-INDEX(L,2)) 630, 710, 630  GAUSS860
630   JROW=INDEX(L,1)  GAUSS870
640   JCOLUM=INDEX(L,2)  GAUSS880
650   DO 705 K=1,N  GAUSS890
660   SWAP=A(K,JROW)  GAUSS900
670   A(K,JROW)=A(K,JCOLUM)  GAUSS910
700   A(K,JCOLUM)=SWAP  GAUSS920
705   CONTINUE  GAUSS930
710   CONTINUE  GAUSS940
740   RETURN  GAUSS950
899   ERROR=1.0  GAUSS960
      RETURN  GAUSS970
      END  GAUSS980
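
The elimination scheme used by GAUSS (full pivot search, row interchange, in-place reduction, and a final column interchange in reverse order) can be followed more easily in a compact form. The Python sketch below is illustrative only and is not part of DA-MRCA. The function name and the use of nested lists are assumptions, and a single right-hand side replaces the listing's B array with M=1. A return value of 0.0 corresponds to the listing's DETERM=0.0 exit when no non-zero pivot is found.

```python
def gauss_jordan_inverse(a, b):
    """Invert square matrix a in place, solve a*x = b in place, return det(a)."""
    n = len(a)
    pivoted = [False] * n   # plays the role of IPIVOT
    swaps = []              # plays the role of INDEX(I,1), INDEX(I,2)
    det = 1.0
    for _ in range(n):
        # Search all not-yet-pivoted rows and columns for the largest element.
        best, row, col = 0.0, -1, -1
        for j in range(n):
            if pivoted[j]:
                continue
            for k in range(n):
                if not pivoted[k] and abs(a[j][k]) > abs(best):
                    best, row, col = a[j][k], j, k
        if row < 0:
            return 0.0      # no usable pivot: matrix is singular
        pivoted[col] = True
        if row != col:
            # Interchange rows to put the pivot on the diagonal.
            a[row], a[col] = a[col], a[row]
            b[row], b[col] = b[col], b[row]
            det = -det
        swaps.append((row, col))
        pivot = a[col][col]
        det *= pivot
        # Divide the pivot row by the pivot element.
        a[col][col] = 1.0
        a[col] = [v / pivot for v in a[col]]
        b[col] /= pivot
        # Reduce the non-pivot rows.
        for r in range(n):
            if r != col:
                t = a[r][col]
                a[r][col] = 0.0
                a[r] = [x - y * t for x, y in zip(a[r], a[col])]
                b[r] -= b[col] * t
    # Undo the row interchanges by swapping columns in reverse order.
    for row, col in reversed(swaps):
        if row != col:
            for r in range(n):
                a[r][row], a[r][col] = a[r][col], a[r][row]
    return det
```

On return `a` holds the inverse and `b` the solution vector, which matches how the calling routines in the listing read A and B after CALL GAUSS.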

T SUBTYPE,FORTRAN,LMAP,PBIN  IDENTM00
      SUBROUTINE IDENTM  IDENTM01
      COMMON A(51,51),BSDEV(51),B(2601),YY(7000),X(52),XD(51)  IDENTM02
      COMMON AVV(52),YSDEV(7000),AW(51),RECM,NDR,MVPL,NNNSAV,NNN,LOT(51)  IDENTM03
      COMMON NNL,DETERM,NOBS,TOLI1,TOLI2,ERROR,NPED,ITOTAL,N,NDPO,ICASE  IDENTM04
      COMMON RSSMO,ISKIP,NJ(25),M4,FIRM(7),KNUM,KMUM,MB,MI,NQ(25),IQ  IDENTM05
      COMMON NNXA,NNSAV,SDEV,AKP(51,51),BB(52),S(52,52),PGLB(10)  IDENTM06
      DIMENSION AIDENT(51,51)  IDENTM07
      EQUIVALENCE(IOGO,NOBS)  IDENTM08
      EQUIVALENCE(AIDENT(1),YY(1))  IDENTM09
    1 DO 3 I=1,NNN  IDENTM10
      DO 4 K=1,NNN  IDENTM11
      SUM=0.0  IDENTM12
      DO 5 J=1,NNN  IDENTM13
    5 SUM=SUM+A(I,J)*AKP(J,K)  IDENTM14
      AIDENT(I,K)=SUM  IDENTM15
    4 CONTINUE  IDENTM16
    3 CONTINUE  IDENTM17
      IOGO=1  IDENTM18
      DO 7 I=1,NNN  IDENTM19
      GO TO(16,17),IOGO  IDENTM20
   16 IF(ABS(AIDENT(I,I)-1.0)-TOLI1)17,8,8  IDENTM21
    8 IOGO=2  IDENTM22
   17 IF(ABS(AIDENT(I,I)-1.0)-TOLI2)7,10,10  IDENTM23
    7 CONTINUE  IDENTM24
      GO TO(20,220),IOGO  IDENTM25
   20 DO 13 I=1,N  IDENTM26
      K=I+1  IDENTM27
      DO 13 J=K,NNN  IDENTM28
      IF(ABS(AIDENT(I,J))-TOLI1)14,15,15  IDENTM29
   14 IF(ABS(AIDENT(J,I))-TOLI1)13,15,15  IDENTM30
   13 CONTINUE  IDENTM31
      GO TO 220  IDENTM32
   10 IOGO=4  IDENTM33
      GO TO 220  IDENTM34
   15 GO TO(18,220),IOGO  IDENTM35
   18 IOGO=3  IDENTM36
  220 RETURN  IDENTM37
      END  IDENTM38
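
The decision logic of IDENTM can be summarized by the Python sketch below. It forms the product of the matrix and its computed inverse and returns a code 1 to 4 in the sense of IOGO, matching the four messages printed by PRINTM. This is an illustration under assumptions, not the original routine: the function name identity_check is hypothetical, tol1 and tol2 stand for TOLI1 and TOLI2, and the branch targets of the off-diagonal tests are read from a partly illegible listing.

```python
def identity_check(a, a_inv, tol1, tol2):
    """Classify how closely a * a_inv reproduces the identity matrix.

    Returns 1: all deviations below tol1 (accepted)
            2: a diagonal deviation in [tol1, tol2) (accepted)
            3: diagonal fine, an off-diagonal deviation >= tol1 (accepted)
            4: a diagonal deviation >= tol2 (rejected)
    """
    n = len(a)
    prod = [[sum(a[i][j] * a_inv[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

    code = 1
    for i in range(n):
        dev = abs(prod[i][i] - 1.0)
        if dev >= tol2:
            return 4          # statement 10: IOGO=4
        if dev >= tol1:
            code = 2          # statement 8: IOGO=2
    if code == 2:
        return 2              # off-diagonal elements are not examined

    # Off-diagonal test, both (i, j) and (j, i), against tol1 only.
    for i in range(n - 1):
        for j in range(i + 1, n):
            if abs(prod[i][j]) >= tol1 or abs(prod[j][i]) >= tol1:
                return 3      # statement 18: IOGO=3
    return 1
```

Note that the off-diagonal elements are only tested when every diagonal element already lies within the tighter tolerance.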

T SUBTYPE,FORTRAN,LMAP,PBIN  IVOR 000
      SUBROUTINE IVOR  IVOR 010
C     IVOR - INDEPENDENT VARIABLE SELECTION SUBROUTINE FOR THE ORDERING OF  IVOR 020
C     INDEPENDENT VARIABLES ACCORDING TO MAGNITUDES OF REGRESSION  IVOR 030
C     SUMS OF SQUARES.  IVOR 040
      COMMON A(51,51),BSDEV(51),B(2601),YY(7000),X(52),XD(51)  IVOR 050
      COMMON AVV(52),YSDEV(7000),AW(51),RECM,NDR,MVPL,NNNSAV,NNN,LOT(51)  IVOR 060
      COMMON NNL,DETERM,NOBS,TOLRS,TOLCES,ERROR,NPED,ITOTAL,N,NDPO,ICASE  IVOR 070
      COMMON RSSMO,ISKIP,NJ(25),M4,FIRM(7),KNUM,KMUM,MB,MI,NQ(25),IQ  IVOR 080
      COMMON NNXA,NNSAV,SDEV,AKP(51,51),BB(52),S(52,52),PGLB(10)  IVOR 090
      COMMON IN(49,10),IR,IS,M1,JLIM,NN,M,NTAPE  IVOR 100
      COMMON SELECT,IBID,IBIDS  IVOR 110
      DIMENSION LAT(51)  IVOR 120
      EQUIVALENCE (LAT,XD)  IVOR 130
      DATA TOLSS(.3E-6)  IVOR 140
      IF(IQ)500,500,501  IVOR 150
  500 IQ=NNSAV-3  IVOR 160
  501 KOUNT=0  IVOR 170
      IGO2=1  IVOR 180
C     SEE NOTE IN BIVOR ON THE USE OF M4.  IVOR 190
      M4=0  IVOR 200
      GO TO(1,2),ISKIP  IVOR 210
    1 WRITE(13,103)  IVOR 220
  103 FORMAT(36H0*** IVOR FINAL COMPREHENSIVE ***  IVOR 230
     1,64H/120X)  IVOR 240
    2 DO 101 I=2,51  IVOR 250
  101 LOT(I)=1  IVOR 260
      ITOTAL=1  IVOR 270
      LOT(1)=0  IVOR 280
      DO 200 I=1,MI  IVOR 290
      ISTART=ITOTAL+1  IVOR 300
      ITOTAL=ITOTAL+NJ(I)  IVOR 310
  201 NUM=2  IVOR 320
      KASSR=0  IVOR 330
C     KASSR COUNTS THE NUMBER OF ASSR-S COMPUTED  IVOR 340
      DO 300 J=ISTART,ITOTAL  IVOR 350
      IF(LOT(J))301,307,301  IVOR 360
  301 LOT(J)=0  IVOR 370
C     IN IVOR, ERROR=1.0 MEANS THAT REDUCM WILL NOT PRINT IDENTIFICATION  IVOR 380
C     AND IVS. ERROR=0.0 MEANS PRINT.  IVOR 390
      ERROR=1.0  IVOR 400
  302 CALL REDUCM  IVOR 410
    3 CALL CASSR(KASSR,KGO)  IVOR 420
  308 KASSR=KASSR  IVOR 430
      GO TO(303,304),KGO  IVOR 440
  303 LAT(KASSR)=J  IVOR 450
  304 LOT(J)=1  IVOR 460
      GO TO 300  IVOR 470
  307 NUM=NUM+1  IVOR 480
  300 CONTINUE  IVOR 490
      IF(KASSR)221,227,204  IVOR 500
  202 IF(NJ(I)-NUM)221,203,201  IVOR 510
  203 IF(I-MI)210,220,221  IVOR 520
  210 DO 211 L=2,ITOTAL  IVOR 530
  211 LOT(L)=0  IVOR 540
      ERROR=0.0  IVOR 550
      CALL REDUCM  IVOR 560
  209 CALL ABT  IVOR 570
      KOUNT=KOUNT+1  IVOR 580
      IF(IQ-KOUNT)221,220,200  IVOR 590
  204 IF(KASSR-1)221,400,401  IVOR 600
  400 IXMAX=1  IVOR 610
      GO TO 402  IVOR 620
  401 DO 229 J=2,KASSR  IVOR 630
      IF(ABS((AW(1)-AW(J))/AW(1))-TOLSS)229,229,404  IVOR 640
  229 CONTINUE  IVOR 650
  405 IGO2=2  IVOR 660
      IXMAX=1  IVOR 670
      GO TO 402  IVOR 680
  404 CALL MAXMIN(KASSR,AW,AMAX,AMIN,IXMAX,IXMIN)  IVOR 690
  402 IMAX=LAT(IXMAX)  IVOR 700
      LOT(IMAX)=0  IVOR 710
      ERROR=0.0  IVOR 720
      CALL REDUCM  IVOR 730
      CALL ABT  IVOR 740
      KOUNT=KOUNT+1  IVOR 750
      IF(IQ-KOUNT)221,502,205  IVOR 760
  502 IGO2=IGO2+2  IVOR 770
  205 GO TO(202,408,220,408),IGO2  IVOR 780
  200 CONTINUE  IVOR 790
  220 RETURN  IVOR 800
  227 PRINT 228  IVOR 810
  228 FORMAT(30H0NO VALID ASSRS WERE COMPUTED.)  IVOR 820
      GO TO 220  IVOR 830
  408 PRINT 411,(LOT(I),I=1,NNNSAV)  IVOR 840
  411 FORMAT(18H0PERFECT FIT,IVS. ,51I1)  IVOR 850
      GO TO 220  IVOR 860
  221 STOP  IVOR 870
      END  IVOR 880
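
The core of IVOR is a greedy forward ranking: each variable not yet in the regression is tried in turn, a score is computed for it (the ASSR value returned by CASSR), and the variable with the largest score is admitted before the next round begins. The Python sketch below shows only that loop. It is a simplified illustration, not the original routine: forward_rank and the gain callback are hypothetical names, the grouping of variables by NJ and the perfect-fit test against TOLSS are omitted, and ties are resolved in favor of the first candidate rather than the last.

```python
def forward_rank(candidates, gain, max_steps=None):
    """Rank candidates greedily by the score each adds at every step.

    candidates : iterable of variable identifiers
    gain       : hypothetical callback gain(selected, candidate) -> float,
                 standing in for the ASSR computed by CASSR
    max_steps  : optional cap on the number of variables admitted
                 (plays the role of IQ in the listing)
    """
    selected = []
    remaining = list(candidates)
    while remaining and (max_steps is None or len(selected) < max_steps):
        scores = [gain(selected, c) for c in remaining]
        best = max(range(len(scores)), key=scores.__getitem__)
        selected.append(remaining.pop(best))  # admit the best candidate
    return selected
```

In the listing the same bookkeeping is carried by LOT, where LOT(J)=0 marks a variable as included and LOT(J)=1 as excluded.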

SUBTYPE,FORTRAN,LMAP,PBIN  MAXMIN00
      SUBROUTINE MAXMIN(N,A,AMAX,AMIN,IXMAX,IXMIN)  MAXMIN01
      DIMENSION A(N)  MAXMIN02
      AMAX=A(1)  MAXMIN03
      AMIN=AMAX  MAXMIN04
      IXMAX=1  MAXMIN05
      IXMIN=1  MAXMIN06
      IF(N.EQ.1)GO TO 220  MAXMIN07
      DO 1 I=2,N  MAXMIN08
      IF(A(I).GE.AMAX)GO TO 2  MAXMIN09
      IF(A(I).GT.AMIN)GO TO 1  MAXMIN10
      IXMIN=I  MAXMIN11
      AMIN=A(I)  MAXMIN12
      GO TO 1  MAXMIN13
    2 AMAX=A(I)  MAXMIN14
      IXMAX=I  MAXMIN15
    1 CONTINUE  MAXMIN16
  220 RETURN  MAXMIN17
      END  MAXMIN18
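
The tie-breaking behavior of MAXMIN follows from its use of .GE. for the maximum and .GT. for the minimum: among equal extreme values, the last one encountered determines the returned index. The Python sketch below reproduces that logic with 1-based positions so that the results can be compared with IXMAX and IXMIN. The function name maxmin is hypothetical, and the sketch is a reading aid rather than a replacement for the routine.

```python
def maxmin(a):
    """Return (amax, amin, ixmax, ixmin) with 1-based positions."""
    amax = amin = a[0]
    ixmax = ixmin = 1
    for i in range(2, len(a) + 1):
        value = a[i - 1]
        if value >= amax:        # IF(A(I).GE.AMAX)GO TO 2
            amax, ixmax = value, i
        elif value > amin:       # IF(A(I).GT.AMIN)GO TO 1
            continue
        else:                    # value <= amin: later ties replace earlier
            amin, ixmin = value, i
    return amax, amin, ixmax, ixmin
```

For the input [3.0, 1.0, 3.0, 1.0] the maximum is reported at position 3 and the minimum at position 4.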
SUBTYPE,FORTRAN,LMAP,PBIN  PREVAR00
      SUBROUTINE PREVAR(KOUNT,INDX)  PREVAR01
      COMMON A(51,51),BSDEV(51),B(2601),YY(7000),X(52),XD(51)  PREVAR02
      COMMON AVV(52),YSDEV(7000),AW(51),RECM,NDR,MVPL,NNNSAV,NNN,LOT(51)  PREVAR03
      COMMON NNL,DETERM,NOBS,TOLRS,TOLCES,ERROR,NPED,ITOTAL,N,NDPO,ICASE  PREVAR04
      COMMON RSSMO,ISKIP,NJ(25),M4,FIRM(7),KNUM,KMUM,MB,MI,NQ(25),IQ  PREVAR05
      COMMON NNXA,NNSAV,SDEV,AKP(51,51),BB(52),S(52,52),PGLB(10)  PREVAR06
      COMMON IN(49,10),IR,IS,M1,JLIM,NN,M,NTAPE  PREVAR07
      COMMON SELECT,IBID,IBIDS  PREVAR08
      DIMENSION XX(51)  PREVAR09
      EQUIVALENCE (XX(1),B(1549))  PREVAR10
      KOUNT=KOUNT+1  PREVAR11
      IF(NNNSAV)650,66,650  PREVAR12
  650 JJJ=1  PREVAR13
      DO 652 JJ=2,NNXA  PREVAR14
      IF(LOT(JJ))104,651,652  PREVAR15
  651 JJJ=JJJ+1  PREVAR16
      X(JJJ)=X(JJ)  PREVAR17
      AVV(JJJ)=AVV(JJ)  PREVAR18
      IF(JJ-INDX)654,654,652  PREVAR19
  654 J=JJJ  PREVAR20
      XX(JJJ)=X(JJ)  PREVAR21
  652 CONTINUE  PREVAR22
      PRINT 70,KOUNT,(XX(I),I=2,J)  PREVAR23
   70 FORMAT(2H (,I3,1H),9(1X,E12.6)/(5X,9(1X,E12.6)))  PREVAR24
      DO 468 I=2,NNN  PREVAR25
  468 XD(I)=X(I)-AVV(I)  PREVAR26
      GO TO 1066  PREVAR27
   66 DO 68 I=2,NNN  PREVAR28
   68 XD(I)=X(I)-AVV(I)  PREVAR29
      PRINT 70,KOUNT,(X(I),I=2,INDX)  PREVAR30
 1066 YY(KOUNT)=B(1)  PREVAR31
      DO 67 I=2,NNN  PREVAR32
   67 YY(KOUNT)=YY(KOUNT)+X(I)*B(I)  PREVAR33
      TEMXX=0.0  PREVAR34
      DO 81 I=2,NNN  PREVAR35
      DO 81 J=2,NNN  PREVAR36
   81 TEMXX=TEMXX+A(I,J)*XD(I)*XD(J)  PREVAR37
      IF(MVPL)812,811,812  PREVAR38
  811 YSDEV(KOUNT)=SDEV*SQRT(1.0+RECM+TEMXX)  PREVAR39
      GO TO 80  PREVAR40
  812 YSDEV(KOUNT)=SDEV*SQRT(RECM+TEMXX)  PREVAR41
   80 RETURN  PREVAR42
  104 STOP  PREVAR43
      END  PREVAR44
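
The last statements of PREVAR compute the standard deviation attached to a predicted value, with the quadratic form TEMXX accumulated over the deviations XD of the new point from the variable means. The Python sketch below restates that computation. It rests on assumptions that this section does not spell out: that A holds the inverse matrix at this stage, that RECM is the reciprocal of the number of observations, and that a nonzero MVPL requests limits for the mean value rather than for an individual observation. The name prediction_sd and its parameter names are hypothetical.

```python
import math


def prediction_sd(x, means, c, sdev, recm, mean_value):
    """Standard deviation of a predicted value at the point x.

    x, means   : values of the independent variables and their means
    c          : matrix used in the quadratic form (A in the listing)
    sdev       : residual standard deviation (SDEV)
    recm       : assumed to be 1/n (RECM)
    mean_value : True selects the MVPL branch without the leading 1.0
    """
    d = [xi - mi for xi, mi in zip(x, means)]          # XD(I)=X(I)-AVV(I)
    temxx = sum(c[i][j] * d[i] * d[j]
                for i in range(len(d)) for j in range(len(d)))
    leading = 0.0 if mean_value else 1.0
    return sdev * math.sqrt(leading + recm + temxx)
```

At the point of means the quadratic form vanishes, so the two variants reduce to SDEV*SQRT(RECM) and SDEV*SQRT(1.0+RECM).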

SUBTYPE,FORTRAN,LMAP,PBIN  PRINTM00
      SUBROUTINE PRINTM  PRINTM01
      COMMON A(51,51),BSDEV(51),B(2601),YY(7000),X(52),XD(51)  PRINTM02
      COMMON AVV(52),YSDEV(7000),AW(51),RECM,NDR,MVPL,NNNSAV,NNN,LOT(51)  PRINTM03
      COMMON NNL,DETERM,NOBS,TOLI1,TOLI2,ERROR,NPED,ITOTAL,N,NDPO,ICASE  PRINTM04
      COMMON RSSMO,ISKIP,NJ(25),M4,FIRM(7),KNUM,KMUM,MB,MI,NQ(25),IQ  PRINTM05
      COMMON NNXA,NNSAV,SDEV,AKP(51,51),BB(52),S(52,52),PGLB(10)  PRINTM06
      DIMENSION AIDENT(51,51)  PRINTM07
      EQUIVALENCE (IOGO,NOBS)  PRINTM08
      EQUIVALENCE (AIDENT(1),YY(1))  PRINTM09

00  TOM  .2.2.21.1000  PRlNTMl3 

PRINT  3  PRlNTMj  j 

    3 FORMAT(16H0IDENTITY MATRIX)  PRINTM12
      DO 4 I=1,NNN  PRINTM13
    4 PRINT 5,(AIDENT(I,J),J=1,NNN)  PRINTM14
    5 FORMAT(1H0,7E17.8/(1X,7E17.8))  PRINTM15
      GO TO(1,7,8,9),IOGO  PRINTM16
    1 PRINT 6,TOLI1  PRINTM17
    6 FORMAT(70H0DEVIATIONS OF ALL ELEMENTS OF THE IDENTITY MATRIX SMALL  PRINTM18
     1ER THAN I(1)= ,G9,15H ,RUN ACCEPTED.)  PRINTM19
      GO TO 220  PRINTM20
    7 PRINT 10,TOLI1,TOLI2  PRINTM21
   10 FORMAT(79H0DEVIATION OF A MAIN DIAGONAL ELEMENT IN THE IDENTITY MA  PRINTM22
     1TRIX LARGER THAN I(1)= ,G9,21H BUT LESS THAN I(2)= ,G9,15H ,RUN  PRINTM23
     2 ACCEPTED.)  PRINTM24
      GO TO 220  PRINTM25
    8 PRINT 11,TOLI1  PRINTM26
   11 FORMAT(84H0DEVIATIONS OF ALL MAIN DIAGONAL ELEMENTS IN THE IDENTIT  PRINTM27
     1Y MATRIX SMALLER THAN I(1)= ,G9/68H DEVIATION OF AN OFF-DIAGONAL E  PRINTM28
     2LEMENT LARGER THAN I(1),RUN ACCEPTED.)  PRINTM29
      GO TO 220  PRINTM30
    9 PRINT 12,TOLI2  PRINTM31
   12 FORMAT(79H0DEVIATION OF A MAIN DIAGONAL ELEMENT IN THE IDENTITY MA  PRINTM32
     1TRIX LARGER THAN I(2)= ,G9,15H ,RUN REJECTED.)  PRINTM33
  220 RETURN  PRINTM34
      END  PRINTM35

SUBTYPE,FORTRAN,LMAP,PBIN  RDISK 00
      SUBROUTINE RDISK(KOUNT,INDX)  RDISK 01
      COMMON A(51,51),BSDEV(51),B(2601),YY(7000),X(52),XD(51)  RDISK 02
      COMMON AVV(52),YSDEV(7000),AW(51),RECM,NDR,MVPL,NNNSAV,NNN,LOT(51)  RDISK 03
      COMMON NNL,DETERM,NOBS,TOLRS,TOLCES,ERROR,NPED,ITOTAL,N,NDPO,ICASE  RDISK 04
      COMMON RSSMO,ISKIP,NJ(25),M4,FIRM(7),KNUM,KMUM,MB,MI,NQ(25),IQ  RDISK 05
      COMMON NNXA,NNSAV,SDEV,AKP(51,51),BB(52),S(52,52),PGLB(10)  RDISK 06
      COMMON IN(49,10),IR,IS,M1,JLIM,NN,M,NTAPE  RDISK 07
      COMMON SELECT,IBID,IBIDS  RDISK 08
      DIMENSION IKEEPR(999)  RDISK 09
      EQUIVALENCE (B(1602),IKEEPR(1))  RDISK 10
      REWIND 10  RDISK 11
      ISTART=1  RDISK 12
      DO 1 I=1,NDR  RDISK 13
      IWHICH=IKEEPR(I)  RDISK 14
      NUMBER=IWHICH-ISTART  RDISK 15
      IF(NUMBER)2,3,4  RDISK 16
    4 DO 11 J=1,NUMBER  RDISK 17
   11 READ(10)SKIP  RDISK 18
    3 READ(10)(X(K),K=2,NNSAV)  RDISK 19
      CALL PREVAR(KOUNT,INDX)  RDISK 20
      ISTART=IWHICH+1  RDISK 21
    1 CONTINUE  RDISK 22
      GO TO 5  RDISK 23
    2 STOP  RDISK 24
    5 RETURN  RDISK 25
      END  RDISK 26

SUBTYPE,FORTRAN,LMAP,PBIN
      SUBROUTINE RDIT
C     RDIT-A PROGRAM TO READ TAPE OR CARDS AND COMPUTE HIGHER ORDER
C     PRODUCT TERMS OF THE DATA.
      COMMON A(51,51),BSDEV(51),B(2601),YY(7000),X(52),XD(51)
      COMMON AVV(52),YSDEV(7000),AW(51),RECM,NDR,MVPL,NNNSAV,NNN,LOT(51)
      COMMON NNL,DETERM,NOBS,TOLRS,TOLCES,ERROR,NPED,ITOTAL,N,NDPO,ICASE
      COMMON RSSMO,ISKIP,NJ(25),M4,FIRM(7),KNUM,KMUM,MB,MI,NQ(25),IQ
      COMMON NNXA,NNSAV,SDEV,AKP(51,51),BB(52),S(52,52),PGLB(10)
      COMMON IN(49,10),IR,IS,M1,JLIM,NN,M,TAPE
      COMMON SELECT,IBID,IBIDS
      DIMENSION Y(52)
      EQUIVALENCE(Y(1),B(53))
      EQUIVALENCE(LIM,NNXA)  RDIT  11
      EQUIVALENCE(KNUM,NUM),(KMUM,MUM)  RDIT  12
      INTEGER TAPE  RDIT  13
    3 DO 33 J=1,JLIM,NUM  RDIT  14
      J1=J+MUM  RDIT  15
      IF(JLIM-J1)11,10,10  RDIT  16
   11 J1=JLIM  RDIT  17
   10 IF(J-1)8,8,9  RDIT  18
    8 READ(TAPE,FIRM)M1,(Y(J2),J2=J,J1)  RDIT  19
      IF(M1)2,33,2  RDIT  20
    9 READ(TAPE,FIRM)M2,(Y(J2),J2=J,J1)  RDIT  21
   33 CONTINUE  RDIT  22
      X(NN)=Y(1)  RDIT  23
      Y(1)=1.  RDIT  24
    4 M=M+1  RDIT  25
      IF(IS)200,200,100  RDIT  26
  100 DO 5 K=1,IS  RDIT  27
      KK=IR+K+1  RDIT  28
      Y(KK)=1.  RDIT  29
      DO 5 L=1,10  RDIT  30
      INDEX=IN(K,L)  RDIT  31
    5 Y(KK)=Y(KK)*Y(INDEX)  RDIT  32
  200 DO 6 J=2,LIM  RDIT  33
    6 X(J)=Y(J)  RDIT  34
    2 RETURN  RDIT  35
      END  RDIT  36
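
The product-term loop of RDIT (statements 100 through 5) builds each higher order term as a product of up to ten existing entries of Y, selected by one row of the index table IN. Because Y(1) is set to 1. beforehand, index 1 serves as a neutral filler for unused factor slots. The Python sketch below illustrates the scheme. The function name append_product_terms is hypothetical, the index rows use the 1-based numbering of the listing, and the reading of records from tape or cards is left out.

```python
def append_product_terms(variables, index_table):
    """Return [1.0, x1, ..., xr, p1, ..., ps] with product terms appended.

    variables   : the r values read for one observation
    index_table : one row per product term; each entry is a 1-based
                  position in the growing list, and position 1 (the
                  constant 1.0) pads rows with fewer factors
    """
    y = [1.0] + list(variables)      # Y(1)=1. followed by the data
    for row in index_table:          # DO 5 K=1,IS
        term = 1.0                   # Y(KK)=1.
        for index in row:            # DO 5 L=1,10
            term *= y[index - 1]     # Y(KK)=Y(KK)*Y(INDEX)
        y.append(term)               # stored at KK=IR+K+1
    return y
```

With variables [2.0, 3.0] and index rows [2, 2, 1] and [2, 3, 1], the appended terms are the square of the first variable and the cross product of the two.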

T SUBTYPE,FORTRAN,LMAP,PBIN  REDUCM00
      SUBROUTINE REDUCM  REDUCM01
      COMMON A(51,51),BSDEV(51),B(2601),YY(7000),X(52),XD(51)  REDUCM02
      COMMON AVV(52),YSDEV(7000),AW(51),RECM,NDR,MVPL,NNNSAV,NNN,LOT(51)  REDUCM03
      COMMON NNL,DETERM,NOBS,TOLRS,TOLCES,ERROR,NPED,ITOTAL,N,NDPO,ICASE  REDUCM04
      COMMON RSSMO,ISKIP,NJ(25),M4,FIRM(7),KNUM,KMUM,MB,MI,NQ(25),IQ  REDUCM05
      COMMON NNXA,NNSAV,SDEV,AKP(51,51),BB(52),S(52,52),PGLB(10)  REDUCM06
      COMMON IN(49,10),IR,IS,M1,JLIM,NN,M,NTAPE  REDUCM07
      COMMON SELECT,IBID,IBIDS  REDUCM08
      EQUIVALENCE(LI,NNN)  REDUCM09
      LI=0  REDUCM10
      DO 95 I=1,ITOTAL  REDUCM11
      IF(LOT(I))95,91,95  REDUCM12
   91 LI=LI+1  REDUCM13
      B(LI)=S(I,NNL)  REDUCM14
      BB(LI)=S(I,NNL)  REDUCM15
      J=LI-1  REDUCM16
      DO 200 L=I,ITOTAL  REDUCM17
      IF(LOT(L))200,203,200  REDUCM18
  203 J=J+1  REDUCM19
      AKP(J,LI)=S(I,L)  REDUCM20
      AKP(LI,J)=S(I,L)  REDUCM21
      A(J,LI)=S(I,L)  REDUCM22
      A(LI,J)=S(I,L)  REDUCM23
  200 CONTINUE  REDUCM24
   95 CONTINUE  REDUCM25
      N=LI-1  REDUCM26
      IF(ERROR)219,219,220  REDUCM27
  219 PRINT 594,PGLB  REDUCM28
  594 FORMAT(1H1,10A8)  REDUCM29
      PRINT 5760,SELECT,(LOT(I),I=1,NNNSAV)  REDUCM30
 5760 FORMAT(32H0INDEPENDENT VARIABLE SELECTION ,A8,1X,51I1)  REDUCM31
      ICASE=ICASE+1  REDUCM32
REDUCM33

REDUCM34